We, the inventors of the above-indicated patent application, hereby declare as follows:



1. We have reviewed U.S. Patent Application No. 09/193,058 (the "Subject Application"),
which is attached in exhibit A as originally filed on November 16, 1998, and U.S. Patent
Application No. 08/666,757, (the "Parent Application"), which is attached in exhibit B as
filed June 19, 1996. The Parent Application issued on April 24, 2001 as U.S. Patent No.
6,222,927.



2. We have each reviewed claims 34, 35, 37-42, 44-48, 50-54, and 56-66 that have been or
will be proposed for the Subject Application (the "Patent Claims"), a copy of which is
attached as exhibit C.

3. We have compared the Patent Claims of exhibit C to the text of the Parent Application of
exhibit B and the text of the Subject Application of Exhibit C. Based on this comparison,
the Parent Application describes in writing the inventions defined by the Patent Claims in
the text common to the Subject Application. Specifically, the written description of the
Parent Application appearing on pages 1-3 is substantially the same as the written
description of the Subject Application appearing on page 1, line 13 - page 3, line 17; the
written description of the Parent Application appearing on pages 8-17 is substantially the
same as the written description of the Subject Application appearing on page 9 - page 17,
line 27; and the written description of the Parent Application appearing on pages 18-19
(Experimental Section) is substantially the same as the written description of the Subject
Application appearing on page 37 (Example One).

4. Research directed to extracting a desired acoustic signal from a noisy acoustic
environment, such as that associated with the "cocktail party effect," began prior to
September 1995 and continued at least into the year 1996 as evidenced by section 14
(pages 6-7) of the attached exhibit D. Exhibit D includes an Invention Disclosure Report
prepared by the inventors, a cover letter dated May 9, 1996, and supplemental
information for the Invention Disclosure Report, and has been partially redacted.

5. Based on information and belief, the cover letter dated May 9, 1996 forwarded the
Invention Disclosure Report to the law firm that prepared and filed the Parent
Application and the Subject Application. Such firm subsequently received the
supplemental information of exhibit D before June 1996.

6. Based on information and belief, at least some of the records and/or files referenced in
exhibit D (including that referenced in section 14(c)) cannot be located due to inadvertent
destruction or loss in connection with one or more equipment upgrades in years
subsequent to the filing of the Parent Application.



7. The subtractive processing algorithm, accompanying mathematical formulae, and other 
aspects set forth in exhibit D were established at least as early as September 15, 1995. 
No later than this date, we formed in our minds a definite and permanent idea of the 
complete and operative inventions defined by the methods of the Patent Claims with the 
establishment of these concepts. We confirm our recollection of this timing by its 
chronological relationship to the information set forth in the documents of Exhibit E that 
all existed prior to September 1995. 

8. From before September 18, 1995 through filing of the Parent Application on June 19, 
1996, the inventors have diligently continued research, development, evaluation, and 
experimentation regarding the inventions defined by the Patent Claims. Such activity 
before September 18, 1995 is supported by at least the information set forth in exhibit D. 
Also, in exhibit D note the formation of one corresponding research team in August of 1994
(section 14(a)) followed by an initial written record in September of 1995 (section 14(b)).
This initial record was prepared by inventor Chen Liu shortly after joining the research
effort during late August of 1995 as corroborated by certain entries of exhibit E attached. 

9. After September 18, 1995, activities continued with the preparation of a detailed research 
initiative proposal dated November 29, 1995, in which the subject matter of exhibit E 
corresponds to the text with the heading "aim #1." A copy of a draft of this proposal is 
provided in exhibit F.



10. Before February 9, 1996, experimental activities included computer simulation of the 
processes described in at least independent claims 34, 46, 61, and 62 of the Patent 
Claims, as explained in section 7 (pages 4-5) of the Invention Disclosure Record of 
exhibit D. These efforts are also discussed in the communication to inventor, Dr. Albert 
Feng, on February 12, 1996 from Lynn Huerta, which is included as exhibit G. To those 
skilled in the art to which the inventions of the Patent Claims pertain, this type of 
simulation establishes performance of the corresponding inventions in the intended 
manner. 



11. Before forwarding the Invention Disclosure Report of exhibit D to counsel on or about
May 9, 1996, testing of an experimental prototype was conducted further establishing 
performance of the inventions of at least independent claims 34, 46, 61, and 62 of the 
Patent Claims in the intended manner. This experimentation is further detailed in section 
10 of the Invention Disclosure Report of exhibit D. The experimental section of the 
Parent Application (pages 18-19) and the Subject Application (page 37) correspond to 
our experimental activities. 

12. Based on information and belief, counsel that received the Invention Disclosure Report of 
exhibit D promptly reviewed it and arranged an interview by telephone with at least one 
of the inventors, Dr. Albert Feng, to further discuss the information contained therein. 
The supplement to the Invention Disclosure Report of exhibit D was subsequently 
received by such counsel before June 1996.



13. Based on information and belief, counsel reviewed the Invention Disclosure Report and 
prepared the Parent Application from the materials provided between May 9, 1996 and its 
filing date of June 19, 1996. 

14. The undersigned, being hereby warned that willful false statements and the like are 
punishable by fine or imprisonment, or both (18 U.S.C. 1001), and may jeopardize the
validity of the application or any patent issuing thereon, declares that all statements made 
of his/her own knowledge are true and that all statements made on information and belief 
are believed to be true. 
Chen Liu, Inventor



Date 



Charissa R. Lansing, Inventor







William D. O'Brien, Jr., Inventor





Bruce C. Wheeler, Inventor



Date 
DECLARATION UNDER 37 CFR § 1.131



We, the inventors of the above-indicated patent application, hereby declare as follows: 



1. We have reviewed U.S. Patent Application No. 09/193,058 (the "Subject Application"),
which is attached in exhibit A as originally filed on November 16, 1998, and U.S. Patent 
Application No. 08/666,757, (the "Parent Application"), which is attached in exhibit B as 
filed June 19, 1996. The Parent Application issued on April 24, 2001 as U.S. Patent No. 
6,222,927. 



2. We have each reviewed claims 34, 35, 37-42, 44-48, 50-54, and 56-66 that have been or
will be proposed for the Subject Application (the "Patent Claims"), a copy of which is 
attached as exhibit C. 



3. We have compared the Patent Claims of exhibit C to the text of the Parent Application of 
exhibit B and the text of the Subject Application of Exhibit C. Based on this comparison, 
the Parent Application describes in writing the inventions defined by the Patent Claims in 
the text common to the Subject Application. Specifically, the written description of the
Parent Application appearing on pages 1-3 is substantially the same as the written 
description of the Subject Application appearing on page 1, line 13 - page 3, line 17; the 
written description of the Parent Application appearing on pages 8-17 is substantially the 
same as the written description of the Subject Application appearing on page 9 - page 17, 
line 27; and the written description of the Parent Application appearing on pages 18-19 
(Experimental Section) is substantially the same as the written description of the Subject 
Application appearing on page 37 (Example One). 

4. Research directed to extracting a desired acoustic signal from a noisy acoustic
environment, such as that associated with the "cocktail party effect," began prior to
September 1995 and continued at least into the year 1996 as evidenced by section 14 
(pages 6-7) of the attached exhibit D. Exhibit D includes an Invention Disclosure Report 
prepared by the inventors, a cover letter dated May 9, 1996, and supplemental 
information for the Invention Disclosure Report, and has been partially redacted. 



5. Based on information and belief, the cover letter dated May 9, 1996 forwarded the
Invention Disclosure Report to the law firm that prepared and filed the Parent 
Application and the Subject Application. Such firm subsequently received the 
supplemental information of exhibit D before June 1996. 



6. Based on information and belief, at least some of the records and/or files referenced in
exhibit D (including that referenced in section 14(c)) cannot be located due to inadvertent
destruction or loss in connection with one or more equipment upgrades in years
subsequent to the filing of the Parent Application. 

7. The subtractive processing algorithm, accompanying mathematical formulae, and other
aspects set forth in exhibit D were established at least as early as September 15, 1995. 
No later than this date, we formed in our minds a definite and permanent idea of the 
complete and operative inventions defined by the methods of the Patent Claims with the 
establishment of these concepts. We confirm our recollection of this timing by its 
chronological relationship to the information set forth in the documents of Exhibit E that 
all existed prior to September 1995. 



8. From before September 18, 1995 through filing of the Parent Application on June 19,
1996, the inventors have diligently continued research, development, evaluation, and 
experimentation regarding the inventions defined by the Patent Claims. Such activity 
before September 18, 1995 is supported by at least the information set forth in exhibit D. 
Also, in exhibit D note the formation of one corresponding research team in August of 1994
(section 14(a)) followed by an initial written record in September of 1995 (section 14(b)).
This initial record was prepared by inventor Chen Liu shortly after joining the research
effort during late August of 1995 as corroborated by certain entries of exhibit E attached. 

9. After September 18, 1995, activities continued with the preparation of a detailed research 
initiative proposal dated November 29, 1995, in which the subject matter of exhibit E 
corresponds to the text with the heading "aim #1." A copy of a draft of this proposal is 
provided in exhibit F.



10. Before February 9, 1996, experimental activities included computer simulation of the 
processes described in at least independent claims 34, 46, 61, and 62 of the Patent 
Claims, as explained in section 7 (pages 4-5) of the Invention Disclosure Record of 
exhibit D. These efforts are also discussed in the communication to inventor, Dr. Albert 
Feng, on February 12, 1996 from Lynn Huerta, which is included as exhibit G. To those 
skilled in the art to which the inventions of the Patent Claims pertain, this type of 
simulation establishes performance of the corresponding inventions in the intended 
manner. 



11. Before forwarding the Invention Disclosure Report of exhibit D to counsel on or about
May 9, 1996, testing of an experimental prototype was conducted further establishing 
performance of the inventions of at least independent claims 34, 46, 61, and 62 of the 
Patent Claims in the intended manner. This experimentation is further detailed in section 
10 of the Invention Disclosure Report of exhibit D. The experimental section of the 
Parent Application (pages 18-19) and the Subject Application (page 37) correspond to 
our experimental activities. 

12. Based on information and belief, counsel that received the Invention Disclosure Report of 
exhibit D promptly reviewed it and arranged an interview by telephone with at least one 
of the inventors, Dr. Albert Feng, to further discuss the information contained therein. 
The supplement to the Invention Disclosure Report of exhibit D was subsequently 
received by such counsel before June 1996.



13. Based on information and belief, counsel reviewed the Invention Disclosure Report and 
prepared the Parent Application from the materials provided between May 9, 1996 and its 
filing date of June 19, 1996. 

14. The undersigned, being hereby warned that willful false statements and the like are 
punishable by fine or imprisonment, or both (18 U.S.C. 1001), and may jeopardize the
validity of the application or any patent issuing thereon, declares that all statements made 
of his/her own knowledge are true and that all statements made on information and belief 
are believed to be true. 



Albert S. Feng, Inventor 



Date 




Chen Liu, Inventor 




Date 
Charissa R. Lansing, Inventor



Date 



William D. O'Brien, Jr., Inventor 



Date 



Bruce C. Wheeler, Inventor 



Date 



BINAURAL SIGNAL PROCESSING TECHNIQUES 

CROSS-REFERENCE TO RELATED APPLICATION 

This application is a continuation-in-part of pending United States Patent Application 
Serial No. 08/666,757, filed on June 19, 1996 by the same inventive entity, and 
entitled BINAURAL SIGNAL PROCESSING SYSTEM AND METHOD. 

BACKGROUND OF THE INVENTION 
The present invention is directed to the processing of acoustic signals, and more 
particularly, but not exclusively, relates to the localization and extraction of acoustic signals 
emanating from different sources. 

The difficulty of extracting a desired signal in the presence of interfering signals is a 
long-standing problem confronted by acoustic engineers. This problem impacts the design 
and construction of many kinds of devices such as systems for voice recognition and 
intelligence gathering. Especially troublesome is the separation of desired sound from 
unwanted sound with hearing aid devices. Generally, hearing aid devices do not permit 
selective amplification of a desired sound when contaminated by noise from a nearby 
source — particularly when the noise is more intense. This problem is even more severe 
when the desired sound is a speech signal and the nearby noise is also a speech signal 
produced by multiple talkers (e.g. babble). As used herein, "noise" refers not only to 
random or nondeterministic signals, but also to undesired signals and signals interfering 
with the perception of a desired signal. 

One attempted solution to this problem has been the application of a single, highly 
directional microphone to enhance directionality of the hearing aid receiver. This approach 
has only a very limited capability. As a result, spectral subtraction, comb filtering, and 
speech-production modeling have been explored to enhance single microphone 
performance. Nonetheless, these approaches still generally fail to improve intelligibility of 
a desired speech signal, particularly when the signal and noise sources are in close 
proximity. 




Another approach has been to arrange a number of microphones in a selected spatial
relationship to form a type of directional detection beam. Unfortunately, when limited to a
size practical for hearing aids, beam forming arrays also have limited capacity to separate
signals that are close together - especially if the noise is more intense than the desired
speech signal. In addition, in the case of one noise source in a less reverberant environment,
the noise cancellation provided by the beam-former varies with the location of the noise
source in relation to the microphone array. R.W. Stadler and W.M. Rabinowitz,
On the Potential of Fixed Arrays for Hearing Aids, 94 Journal Acoustical Society of America 1332
(September 1993), and W. Soede et al., Development of a Directional Hearing Instrument
Based on Array Technology, 94 Journal of Acoustical Society of America 785 (August
1993) are cited as additional background concerning the beam forming approach.

Still another approach has been the application of two microphones displaced from
one another to provide two signals to emulate certain aspects of the binaural hearing system
common to humans and many types of animals. Although certain aspects of biologic
binaural hearing are not fully understood, it is believed that the ability to localize sound
sources is based on evaluation by the auditory system of binaural time delays and sound
levels across different frequency bands associated with each of the two sound signals. The
localization of sound sources with systems based on these interaural time and intensity
differences is discussed in W. Lindemann, Extension of a Binaural Cross-Correlation Model
by Contralateral Inhibition - I. Simulation of Lateralization for Stationary Signals, 80
Journal of the Acoustical Society of America 1608 (December 1986).

The localization of multiple acoustic sources based on input from two microphones
presents several significant challenges, as does the separation of a desired signal once the
sound sources are localized. For example, the system set forth in Markus Bodden,
Modeling Human Sound-Source Localization and the Cocktail-Party-Effect, 1 Acta
Acustica 43 (February/April 1993) employs a Wiener filter including a windowing process
in an attempt to derive a desired signal from binaural input signals once the location of the
desired signal has been established. Unfortunately, this approach results in significant
deterioration of desired speech fidelity. Also, the system has only been demonstrated to
suppress noise of equal intensity to the desired signal at an azimuthal separation of at least
30 degrees. A more intense noise emanating from a source spaced closer than 30 degrees
from the desired source continues to present a problem. Moreover, the proposed algorithm
of the Bodden system is computationally intense - posing a serious question of whether it
can be practically embodied in a hearing aid device.
Another example of a two microphone system is found in D. Banks, Localisation and
Separation of Simultaneous Voices with Two Microphones, IEE Proceedings-I, 140 (1993).
This system employs a windowing technique to estimate the location of a sound source
when there are nonoverlapping gaps in its spectrum compared to the spectrum of interfering
noise. This system cannot perform localization when wide-band signals lacking such gaps
are involved. In addition, the Banks article fails to provide details of the algorithm for
reconstructing the desired signal. U.S. Patent Nos. 5,479,522 to Lindemann et al.; 5,325,436
to Soli et al.; 5,289,544 to Franklin; and 4,773,095 to Zwicker et al. are cited as sources of
additional background concerning dual microphone hearing aid systems.

Effective localization is also often hampered by ambiguous positional information
that results above certain frequencies related to the spacing of the input microphones. This
problem was recognized in Stern, R. M., Zeiberg, A. S., and Trahiotis, C. "Lateralization of
complex binaural stimuli: A weighted-image model," J. Acoust. Soc. Am. 84, 156-165
(1988).
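
The frequency limit mentioned above can be illustrated with a short calculation. The sketch below assumes a far-field source, a speed of sound of 343 m/s, and a hypothetical 0.15 m sensor spacing; none of these values come from the text. Under those assumptions the interaural phase difference at frequency f is 2*pi*f*D*sin(angle)/c, and it no longer identifies a unique position once it can exceed half a cycle.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def ambiguity_frequency(spacing_m, azimuth_deg=90.0):
    """Lowest frequency (Hz) at which the interaural phase difference of a
    far-field source at the given azimuth reaches half a cycle (pi radians).

    Above this frequency the phase difference wraps, so the same measured
    phase is consistent with more than one source position.
    """
    path_difference_m = spacing_m * abs(math.sin(math.radians(azimuth_deg)))
    if path_difference_m == 0.0:
        return math.inf  # an on-axis source never produces a phase difference
    return SPEED_OF_SOUND_M_S / (2.0 * path_difference_m)


# Hypothetical spacing, roughly the width of a human head.
spacing = 0.15
worst_case_hz = ambiguity_frequency(spacing)        # source fully to one side
at_20_deg_hz = ambiguity_frequency(spacing, 20.0)   # source nearer the axis
```

With these assumed values the worst-case limit is a little over 1.1 kHz, and halving the spacing doubles it, which is why the ambiguity is tied to the microphone spacing.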

Thus, a need remains for more effective localization and extraction techniques -
especially for use with binaural systems. The present invention meets these needs and
offers other significant benefits and advantages.






SUMMARY OF THE INVENTION 



The present invention relates to the processing of acoustic signals. Various aspects of
the invention are novel, nonobvious, and provide various advantages. While the actual
nature of the invention covered herein can only be determined with reference to the claims
appended hereto, selected forms and features of the preferred embodiments as disclosed
herein are described briefly as follows.

One form of the present invention includes a signal processing technique for
localizing and characterizing each of a number of differently located acoustic sources.
Detection of the sources is performed with two sensors that are spaced apart. Each, or one
particular selected source may be extracted, while suppressing the output of the other
sources. A variety of applications may benefit from this technique including hearing aids,
sound location mapping or tracking devices, and voice recognition equipment, to name a
few.

In another form, a first signal is provided from a first acoustic sensor and a second
signal from a second acoustic sensor spaced apart from the first acoustic sensor. The first
and second signals each correspond to a composite of two or more acoustic sources that, in
turn, include a plurality of interfering sources and a desired source. The interfering sources
are localized by processing of the first and second signals to provide a corresponding
number of interfering source signals. These signals each include a number of frequency
components. One or more of the frequency components are suppressed for each of the
interfering source signals. This approach facilitates nulling a different frequency
component for each of a number of noise sources with two input sensors.

A further form of the present invention is a processing system having a pair of sensors
and a delay operator responsive to a pair of input signals from the sensors to generate a
number of delayed signals therefrom. The system also has a localization operator
responsive to the delayed signals to localize the interfering sources relative to the location
of the sensors and provide a plurality of interfering source signals each represented by a
number of frequency components. The system further includes an extraction operator that
serves to suppress selected frequency components for each of the interfering source signals
and extract a desired signal corresponding to a desired source. An output device responsive
to the desired signal is also included that provides an output representative of the desired
source. This system may be incorporated into a signal processor coupled to the sensors to
facilitate localizing and suppressing multiple noise sources when extracting a desired signal.

Still another form is responsive to position-plus-frequency attributes of sound sources.
It includes positioning a first acoustic sensor and a second acoustic sensor to detect a 
plurality of differently located acoustic sources. First and second signals are generated by 
the first and second sensors, respectively, that receive stimuli from the acoustic sources. A 
number of delayed signal pairs are provided from the first and second signals that each 
correspond to one of a number of positions relative to the first and second sensors. The 
sources are localized as a function of the delayed signal pairs and a number of coincidence 
patterns. These patterns are position and frequency specific, and may be utilized to 
recognize and correspondingly accumulate position data estimates that map to each true 
source position. As a result, these patterns may operate as filters to provide better 
localization resolution and eliminate spurious data. 
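
The delayed signal pairs described above can be sketched for a single frequency component. This is a minimal illustration under stated assumptions, not the implementation disclosed in the application: it assumes one far-field source, a speed of sound of 343 m/s, a hypothetical 0.15 m sensor spacing, and a sign convention in which positive azimuth lies toward the right sensor. All function names are hypothetical, and the coincidence patterns are omitted.

```python
import cmath
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed
SPACING_M = 0.15            # hypothetical sensor spacing


def interaural_delay(azimuth_deg):
    """Far-field arrival-time difference (s) between the two sensors."""
    return SPACING_M * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND_M_S


def delay(component, freq_hz, delay_s):
    """Delay one complex frequency component by applying a phase shift."""
    return component * cmath.exp(-2j * math.pi * freq_hz * delay_s)


def delayed_pairs(left, right, freq_hz, azimuths_deg):
    """Yield (azimuth, delayed_left, delayed_right) for each candidate position.

    The two channels are delayed by complementary amounts so that the pair
    belonging to the true source position ends up time-aligned.
    """
    max_delay = SPACING_M / SPEED_OF_SOUND_M_S
    for azimuth in azimuths_deg:
        tau = interaural_delay(azimuth)
        yield (azimuth,
               delay(left, freq_hz, (max_delay - tau) / 2.0),
               delay(right, freq_hz, (max_delay + tau) / 2.0))


def localize(left, right, freq_hz, azimuths_deg):
    """Return the candidate azimuth whose delayed pair differs the least."""
    return min(delayed_pairs(left, right, freq_hz, azimuths_deg),
               key=lambda item: abs(item[1] - item[2]))[0]


# Simulate one 500 Hz component arriving from 20 degrees: the right sensor
# receives it half the interaural delay early, the left sensor half late.
freq = 500.0
true_tau = interaural_delay(20.0)
left_in = delay(1.0 + 0.0j, freq, true_tau / 2.0)
right_in = delay(1.0 + 0.0j, freq, -true_tau / 2.0)
estimate = localize(left_in, right_in, freq, range(-90, 91, 5))
```

At a low frequency such as the 500 Hz used here, only one candidate position aligns the pair. At higher frequencies several candidates can align equally well, which appears to be the ambiguity that the position and frequency specific patterns are intended to resolve.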

In yet another form, a system includes two sensors each configured to generate a
corresponding first or second input signal and a delay operator responsive to these signals to
generate a number of delayed signals each corresponding to one of a number of positions
relative to the sensors. The system also includes a localization operator responsive to the
delayed signals for determining the number of sound source localization signals. These
localization signals are determined from the delayed signals and a number of coincidence
patterns that each correspond to one of the positions. The patterns each relate frequency
varying sound source location information caused by ambiguous phase multiples to a
corresponding position to improve acoustic source localization. The system also has an
output device responsive to the localization signals to provide an output corresponding to at
least one of the sources.

A further form utilizes two sensors to provide corresponding binaural signals from
which the relative separation of a first acoustic source from a second acoustic source may be
established as a function of time, and the spectral content of a desired acoustic signal from
the first source may be representatively extracted. Localization and identification of the
spectral content of the desired acoustic signal may be performed concurrently. This form
may also successfully extract the desired acoustic signal even if a nearby noise source is of
greater relative intensity.

Another form of the present invention employs a first and second sensor at different
locations to provide a binaural representation of an acoustic signal which includes a desired
signal emanating from a selected source and interfering signals emanating from several
interfering sources. A processor generates a discrete first spectral signal and a discrete
second spectral signal from the sensor signals. The processor delays the first and second
spectral signals by a number of time intervals to generate a number of delayed first signals
and a number of delayed second signals and provide a time increment signal. The time
increment signal corresponds to separation of the selected source from the noise source.
The processor generates an output signal as a function of the time increment signal, and an
output device responds to the output signal to provide an output representative of the
desired signal.

An additional form includes positioning a first and second sensor relative to a first
signal source with the first and second sensor being spaced apart from each other and a
second signal source being spaced apart from the first signal source. A first signal is
provided from the first sensor and a second signal is provided from the second sensor. The
first and second signals each represent a composite acoustic signal including a desired
signal from the first signal source and unwanted signals from other sound sources. A
number of spectral signals are established from the first and second signals as functions of a
number of frequencies. A member of the spectral signals representative of position of the
second signal source is determined, and an output signal is generated from the member
which is representative of the first signal source. This feature facilitates extraction of a
desired signal from a spectral signal determined as part of the localization of the interfering
source. This approach avoids the extensive post-localization computations required by
many binaural systems to extract a desired signal.

Accordingly, it is one object of the present invention to provide for the enhanced
localization of multiple acoustic sources.

It is another object to extract a desired acoustic signal from a noisy environment
caused by a number of interfering sources.

An additional object is to provide a system for the localization and extraction of
acoustic signals by detecting a combination of these signals with two differently located
sensors.

Further objects, features, aspects, benefits, forms, and advantages of the present
invention shall become apparent from the detailed drawings and descriptions provided
herein.
BRIEF DESCRIPTION OF THE DRAWINGS



FIG. 1 is a diagrammatic view of a system of one embodiment of the present
invention.

FIG. 2 [illegible] of the system of FIG. 1.

FIG. 3 is a schematic representation of the dual delay line of FIG. 2.

FIGS. 4A and 4B depict other embodiments of the present invention corresponding to
hearing aid and computer voice recognition applications, respectively.

FIG. 5 is a graph of a speech signal in the form of a sentence about 2 seconds long.

FIG. 6 is a graph of a composite signal including babble noise and the speech signal
[illegible] relative to the speech signal source.

FIG. 7 is a graph of a signal representative of the speech signal of FIG. 5 after
extraction from the composite signal of FIG. 6.

FIG. 8 is a graph of a composite signal including babble noise and the speech signal
of FIG. 5 at a -30 dB signal-to-noise ratio with the babble noise source at a 2 degree
azimuth relative to the speech signal source.

FIG. 9 is a graphic depiction of a signal representative of the sample speech signal of
FIG. 5 after extraction from the composite signal of FIG. 8.

FIG. 11 is a partial signal flow diagram illustrating selected aspects of the dual delay
lines of FIG. 10 in greater detail.

FIG. 12 is a diagram illustrating selected geometric features of the embodiment
[illegible].

FIG. 13 is a signal flow diagram illustrating selected aspects of the localization
operator of FIG. 10 in greater detail.

FIG. 15 is a signal flow diagram further illustrating selected aspects of the
[illegible].

FIG. 16 [illegible] operator of FIG. 15 in greater detail.

FIG. 17 is a graph illustrating a plot of coincidence loci for two sources.

FIG. 18 is a graph illustrating coincidence patterns for azimuth positions
corresponding to -75°, 0°, 20°, and 75°.

FIGS. 19-22 are tables depicting experimental results obtained with the present
invention.
DESCRIPTION OF THE PREFERRED EMBODIMENT



For the purposes of promoting an understanding of the principles of the invention,
reference will now be made to the embodiment illustrated in the drawings and specific
language will be used to describe the same. It will nevertheless be understood that no
limitation of the scope of the invention is thereby intended. Any alterations and further
modifications in the described device, and any further applications of the principles of the
invention as described herein are contemplated as would normally occur to one skilled in
the art to which the invention relates.

FIG. 1 illustrates an acoustic signal processing system 10 of one embodiment
of the present invention. System 10 is configured to extract a desired acoustic signal
from source 12 despite interference or noise emanating from nearby source 14.
System 10 includes a pair of acoustic sensors 22, 24 configured to detect acoustic
excitation that includes signals from sources 12, 14. Sensors 22, 24 are operatively
coupled to processor 30 to process signals received therefrom. Also, processor 30 is
operatively coupled to output device 90 to provide a signal representative of a desired
signal from source 12 with reduced interference from source 14 as compared to
composite acoustic signals presented to sensors 22, 24 from sources 12, 14.

Sensors 22, 24 are spaced apart from one another by distance D along lateral
axis T. Midpoint M represents the halfway point along distance D from sensor 22 to
sensor 24. Reference axis R1 is aligned with source 12 and intersects axis T
perpendicularly through midpoint M. Axis N is aligned with source 14 and also
intersects midpoint M. Axis N is positioned to form angle A with reference axis R1.
FIG. 1 depicts an angle A of about 20 degrees. Notably, reference axis R1 may be
selected to define a reference azimuthal position of zero degrees in an azimuthal plane
intersecting sources 12, 14; sensors 22, 24; and containing axes T, N, R1. As a result,
source 12 is "on-axis" and source 14, as aligned with axis N, is "off-axis." Source 14
is illustrated at about a 20 degree azimuth relative to source 12.

Preferably sensors 22, 24 are fixed relative to each other and configured to
move in tandem to selectively position reference axis R1 relative to a desired acoustic
signal source. It is also preferred that sensors 22, 24 be microphones of a
conventional variety, such as omnidirectional dynamic microphones. In other
embodiments, a different sensor type may be utilized as would occur to one skilled in
the art.

Referring additionally to FIG. 2, a signal flow diagram illustrates various
processing stages for the embodiment shown in FIG. 1. Sensors 22, 24 provide
analog signals Lp(t) and Rp(t) corresponding to the left sensor 22 and right sensor 24,
respectively. Signals Lp(t) and Rp(t) are initially input to processor 30 in separate
processing channels L and R. For each channel L, R, signals Lp(t) and Rp(t) are
conditioned and filtered in stages 32a, 32b, respectively, to reduce aliasing. After
filter stages 32a, 32b, the conditioned signals Lp(t), Rp(t) are input to corresponding
Analog to Digital (A/D) converters 34a, 34b to provide discrete signals Lp(k), Rp(k),
where k indexes discrete sampling events. In one embodiment, A/D stages 34a, 34b
sample signals Lp(t) and Rp(t) at a rate of at least twice the frequency of the upper
end of the audio frequency range to assure a high fidelity representation of the input
signals.

Discrete signals Lp(k) and Rp(k) are transformed from the time domain to the
frequency domain by a short-term Discrete Fourier Transform (DFT) algorithm in
stages 36a, 36b to provide complex-valued signals XLp(m) and XRp(m). Signals
XLp(m) and XRp(m) are evaluated in stages 36a, 36b at discrete frequencies f_m,
where m is an index (m=1 to m=M) to discrete frequencies, and index p denotes the
short-term spectral analysis time frame. Index p is arranged in reverse chronological
order with the most recent time frame being p = 1, the next most recent time frame
being p = 2, and so forth. Preferably, frequencies f_m encompass the audible frequency
range and the number of samples employed in the short-term analysis is selected to
strike an optimum balance between processing speed limitations and desired
resolution of resulting output signals. In one embodiment, an audio range of 0.1 to 6
kHz is sampled in A/D stages 34a, 34b at a rate of at least 12.5 kHz with 512 samples
per short-term spectral analysis time frame. In alternative embodiments, the
frequency domain analysis may be provided by an analog filter bank employed before
A/D stages 34a, 34b. It should be understood that the spectral signals XLp(m) and
XRp(m) may be represented as arrays each having a 1xM dimension corresponding to
the different frequencies f_m.
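
As a rough illustration of the figures quoted for this embodiment, the following Python
sketch computes the bin spacing for a 512-sample frame sampled at 12.5 kHz and lists the
DFT bins whose centre frequencies fall in the 0.1 to 6 kHz range. The function names, the
test tone, and the use of a plain unwindowed DFT are illustrative assumptions and are not
taken from the specification.

```python
import cmath
import math

SAMPLE_RATE_HZ = 12_500.0   # sampling rate quoted for one embodiment
FRAME_LEN = 512             # samples per short-term analysis frame


def bin_frequency(m, sample_rate=SAMPLE_RATE_HZ, frame_len=FRAME_LEN):
    """Centre frequency f_m (Hz) of DFT bin m."""
    return m * sample_rate / frame_len


def audio_bins(f_lo=100.0, f_hi=6000.0):
    """Bin indices whose centre frequency lies in [f_lo, f_hi]."""
    return [m for m in range(FRAME_LEN // 2 + 1)
            if f_lo <= bin_frequency(m) <= f_hi]


def dft_bin(frame, m):
    """Direct evaluation of one DFT coefficient of a real-valued frame."""
    n_samples = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * m * k / n_samples)
               for k, x in enumerate(frame))


if __name__ == "__main__":
    bins = audio_bins()
    print("bin spacing (Hz):", bin_frequency(1))
    print("bins used:", bins[0], "to", bins[-1], "count:", len(bins))
    # Hypothetical test tone placed exactly on bin 41 (about 1 kHz).
    tone = [math.cos(2 * math.pi * 41 * k / FRAME_LEN) for k in range(FRAME_LEN)]
    print("|X(41)| =", round(abs(dft_bin(tone, 41)), 3))
```

A production implementation would normally use a fast Fourier transform and a window
function; the direct sum is used here only to keep the sketch self-contained.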

Spectral signals XLp(m) and XRp(m) are input to dual delay line 40 as
further detailed in FIG. 3. FIG. 3 depicts two delay lines 42, 44 each having N
number of delay stages. Each delay line 42, 44 is sequentially configured with delay
stages D_1 through D_N. Delay lines 42, 44 are configured to delay corresponding
input signals in opposing directions from one delay stage to the next, and generally
correspond to the dual hearing channels associated with a natural binaural hearing
process. Delay stages D_1, D_2, D_3, ..., D_(N-2), D_(N-1), and D_N each delay an input
signal by corresponding time delay increments τ_1, τ_2, τ_3, ..., τ_(N-2), τ_(N-1), and τ_N
(collectively designated τ_i), where index i goes from left to right. For delay line 42,
XLp(m) is alternatively designated XLp^1(m). XLp^1(m) is sequentially delayed by
time delay increments τ_1, τ_2, τ_3, ..., τ_(N-2), τ_(N-1), and τ_N to produce delayed outputs
at the taps of delay line 42 which are respectively designated XLp^2(m), XLp^3(m),
XLp^4(m), ..., XLp^(N-1)(m), XLp^N(m), and XLp^(N+1)(m) (collectively designated
XLp^i(m)). For delay line 44, XRp(m) is alternatively designated XRp^(N+1)(m).
XRp^(N+1)(m) is sequentially delayed by time delay increments τ_1, τ_2, τ_3, ..., τ_(N-2),
τ_(N-1), and τ_N to produce delayed outputs at the taps of delay line 44 which are
respectively designated XRp^N(m), XRp^(N-1)(m), XRp^(N-2)(m), ..., XRp^3(m),
XRp^2(m), and XRp^1(m) (collectively designated XRp^i(m)). The input spectral
signals and the signals from delay line 42, 44 taps are arranged as input pairs to
operation array 46. A pair of taps from delay lines 42, 44 is illustrated as input pair P
in FIG. 3.

Operation array 46 has operation units (OP) numbered from 1 to N+1,
depicted as OP1, OP2, OP3, OP4, ..., OPN-2, OPN-1, OPN, OPN+1 and collectively
designated operations OPi. Input pairs from delay lines 42, 44 correspond to the
operations of array 46 as follows: OP1[XLp^1(m), XRp^1(m)], OP2[XLp^2(m),
XRp^2(m)], OP3[XLp^3(m), XRp^3(m)], OP4[XLp^4(m), XRp^4(m)], ...,
OPN-2[XLp^(N-2)(m), XRp^(N-2)(m)], OPN-1[XLp^(N-1)(m), XRp^(N-1)(m)],
OPN[XLp^N(m), XRp^N(m)], and OPN+1[XLp^(N+1)(m), XRp^(N+1)(m)]; where
OPi[XLp^i(m), XRp^i(m)] indicates that OPi is determined as a function of input pair
XLp^i(m), XRp^i(m). Correspondingly, the outputs of operation array 46 are Xp^1(m),
Xp^2(m), Xp^3(m), Xp^4(m), ..., Xp^(N-2)(m), Xp^(N-1)(m), Xp^N(m), and Xp^(N+1)(m)
(collectively designated Xp^i(m)).

For i = 1 to i ≤ N/2, operations for each OPi of array 46 are determined in
accordance with complex expression 1 (CE1) as follows:

\[ X_p^{i}(m) = \frac{XL_p^{i}(m) - XR_p^{i}(m)}{\exp[-j2\pi(\tau_i + \cdots + \tau_{N/2}) f_m] - \exp[j2\pi(\tau_{(N/2)+1} + \cdots + \tau_{N-i+1}) f_m]} \tag{CE1} \]

where exp[argument] represents a natural exponent to the power of the argument, and
imaginary number j is the square root of -1. For i > ((N/2) + 1) to i = N+1, operations
of operation array 46 are determined in accordance with complex expression 2 (CE2) as
follows:

\[ X_p^{i}(m) = \frac{XL_p^{i}(m) - XR_p^{i}(m)}{\exp[j2\pi(\tau_{(N/2)+1} + \cdots + \tau_{i-1}) f_m] - \exp[-j2\pi(\tau_{N-i+2} + \cdots + \tau_{N/2}) f_m]} \tag{CE2} \]

For i = (N/2)+1, neither CE1 nor CE2 is performed.

An example of the determination of the operations for N = 4 (i=1 to i=N+1)
is as follows:

i = 1, CE1 applies as follows:

\[ X_p^{1}(m) = \frac{XL_p^{1}(m) - XR_p^{1}(m)}{\exp[-j2\pi(\tau_1 + \tau_2) f_m] - \exp[j2\pi(\tau_3 + \tau_4) f_m]} \]

i = 2 ≤ (N/2), CE1 applies as follows:

\[ X_p^{2}(m) = \frac{XL_p^{2}(m) - XR_p^{2}(m)}{\exp[-j2\pi \tau_2 f_m] - \exp[j2\pi \tau_3 f_m]} \]

i = 3: Not applicable, (N/2) < i ≤ ((N/2)+1);

i = 4, CE2 applies as follows:

\[ X_p^{4}(m) = \frac{XL_p^{4}(m) - XR_p^{4}(m)}{\exp[j2\pi \tau_3 f_m] - \exp[-j2\pi \tau_2 f_m]} \]

and, i = 5, CE2 applies as follows:

\[ X_p^{5}(m) = \frac{XL_p^{5}(m) - XR_p^{5}(m)}{\exp[j2\pi(\tau_3 + \tau_4) f_m] - \exp[-j2\pi(\tau_1 + \tau_2) f_m]} \]
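
The index bookkeeping of CE1 and CE2 can be checked against the N = 4 example with a short
Python sketch. The helper names below are hypothetical, and the general CE2 index ranges
are inferred from the worked example rather than quoted from a legible copy of the
expression, so this should be read as one plausible interpretation.

```python
import cmath
import math


def ce_index_sets(i, n_stages):
    """Return (neg_set, pos_set): the tau indices summed in the exponent that
    carries the minus sign and in the exponent that carries the plus sign.
    Returns None for the centre operation i = N/2 + 1."""
    if n_stages % 2:
        raise ValueError("N is assumed to be even")
    half = n_stages // 2
    if 1 <= i <= half:                       # CE1
        return (list(range(i, half + 1)),
                list(range(half + 1, n_stages - i + 2)))
    if half + 1 < i <= n_stages + 1:         # CE2
        return (list(range(n_stages - i + 2, half + 1)),
                list(range(half + 1, i)))
    return None


def ce_denominator(i, taus, f_m):
    """Complex denominator of CE1/CE2; taus[k - 1] holds tau_k in seconds."""
    n_stages = len(taus)
    sets = ce_index_sets(i, n_stages)
    if sets is None:
        raise ValueError("no operation is defined for the centre tap")
    neg_set, pos_set = sets
    neg = cmath.exp(-2j * math.pi * f_m * sum(taus[k - 1] for k in neg_set))
    pos = cmath.exp(2j * math.pi * f_m * sum(taus[k - 1] for k in pos_set))
    return neg - pos if i <= n_stages // 2 else pos - neg


if __name__ == "__main__":
    for i in range(1, 6):
        print(i, ce_index_sets(i, 4))
```

For N = 4 the sketch pairs i = 1 with i = 5 and i = 2 with i = 4: each pair uses the same
delay sums with the sign of the denominator reversed, matching the example above.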



Referring to FIGS. 1-3, each OPi of operation array 46 is defined to be
representative of a different azimuthal position relative to reference axis R. The
"center" operation, OPi where i = ((N/2)+1), represents the location of the reference
axis and source 12. For the example N=4, this center operation corresponds to i = 3.
This arrangement is analogous to the different interaural time differences associated
with a natural binaural hearing system. In these natural systems, there is a relative
position in each sound passageway within the ear that corresponds to a maximum "in
phase" peak for a given sound source. Accordingly, each operation of array 46
represents a position corresponding to a potential azimuthal or angular position range
for a sound source, with the center operation representing a source at the zero azimuth,
that is, a source aligned with reference axis R. For an environment having a single source
without noise or interference, determining the signal pair with the maximum strength
may be sufficient to locate the source with little additional processing; however, in
noisy or multiple source environments, further processing may be needed to properly
estimate locations.

It should be understood that dual delay line 40 provides a two dimensional matrix
of outputs with N+1 columns corresponding to Xp^i(m), and M rows corresponding to
each discrete frequency f_m of Xp^i(m). This (N+1)xM matrix is determined for each
short-term spectral analysis interval p. Furthermore, by subtracting XRp^i(m) from
XLp^i(m), the denominator of each expression CE1, CE2 is arranged to provide a
minimum value of Xp^i(m) when the signal pair is "in-phase" at the given frequency
f_m. Localization stage 70 uses this aspect of expressions CE1, CE2 to evaluate the
location of source 14 relative to source 12.

Localization stage 70 accumulates P number of these matrices to determine the
Xp^i(m) representative of the position of source 14. For each column i, localization
stage 70 performs a summation of the amplitude of |Xp^i(m)| to the second power over
frequencies f_m from m=1 to m=M. The summation is then multiplied by the inverse
of M to find an average spectral energy as follows:

\[ Xavg_p^{i} = \frac{1}{M} \sum_{m=1}^{M} |X_p^{i}(m)|^2 . \]

The resulting averages, Xavgp^i, are then time averaged over the P most recent
spectral-analysis time frames indexed by p in accordance with:

\[ X^{i} = \sum_{p=1}^{P} \gamma_p \, Xavg_p^{i} , \]

where γ_p are empirically determined weighting factors. In one embodiment, the γ_p
factors are preferably between 0.85^p and 0.90^p, where p is the short-term spectral
analysis time frame index. The X^i are analyzed to determine the minimum value,
min(X^i). The index i of min(X^i), designated "I," estimates the column representing
the azimuthal location of source 14 relative to source 12.
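
The averaging and minimum search performed by localization stage 70 may be sketched in
Python as follows. The data layout, the function names, and the choice γ_p = γ^p with a
single base γ (0.9 by default, inside the 0.85 to 0.90 range mentioned above) are
assumptions made for illustration.

```python
def average_spectral_energy(column):
    """(1/M) * sum over m of |Xp^i(m)|^2 for one column i of one frame."""
    return sum(abs(x) ** 2 for x in column) / len(column)


def localize(frames, gamma=0.9):
    """Estimate the column index I of the interfering source.

    frames[p - 1][i - 1] is the list of M complex values Xp^i(m); p = 1 is
    the most recent frame. Returns (I, totals), where I is the 1-based index
    of the smallest time-averaged energy and totals[i - 1] is X^i.
    """
    n_cols = len(frames[0])
    totals = [0.0] * n_cols
    for p, frame in enumerate(frames, start=1):
        weight = gamma ** p              # assumed form of the weighting factor
        for i, column in enumerate(frame):
            totals[i] += weight * average_spectral_energy(column)
    best = min(range(n_cols), key=totals.__getitem__)
    return best + 1, totals


if __name__ == "__main__":
    # Hypothetical data: three columns, two frequencies, two identical frames.
    frame = [[1 + 0j, 1j], [0.1 + 0j, 0.1j], [2 + 0j, 2j]]
    index, totals = localize([frame, frame])
    print("I =", index, "X^i =", totals)
```

Because p = 1 is the most recent frame, a weight of the form γ^p emphasizes recent frames
over older ones.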

It has been discovered that the spectral content of a desired signal from source
12, when approximately aligned with reference axis R1, can be estimated from
Xp^I(m). In other words, the spectral signal output by array 46 which most closely
corresponds to the relative location of the "off-axis" source 14 contemporaneously
provides a spectral representation of a signal emanating from source 12. As a result,
the signal processing of dual delay line 40 not only facilitates localization of source
14, but also provides a spectral estimate of the desired signal with only minimal
post-localization processing to produce a representative output.

Post-localization processing includes provision of a designation signal by
localization stage 70 to conceptual "switch" 80 to select the output column Xp^I(m) of
the dual delay line 40. The Xp^I(m) is routed by switch 80 to an inverse Discrete
Fourier Transform algorithm (Inverse DFT) in stage 82 for conversion from a
frequency domain signal representation to a discrete time domain signal
representation denoted as s(k). The signal estimate s(k) is then converted by Digital
to Analog (D/A) converter 84 to provide an output signal to output device 90.
Output device 90 amplifies the output signal from processor 30 with amplifier
92 and supplies the amplified signal to speaker 94 to provide the extracted signal from
source 12.

It has been found that interference from off-axis sources separated by as little
as 2 degrees from the on-axis source may be reduced or eliminated with the present
invention, even when the desired signal includes speech and the interference
includes babble. Moreover, the present invention provides for the extraction of
desired signals even when the interfering or noise signal is of equal or greater relative
intensity. By moving sensors 22, 24 in tandem the signal selected to be extracted may
correspondingly be changed. Moreover, the present invention may be employed in an
environment having many sound sources in addition to sources 12, 14. In one
alternative embodiment, the localization algorithm is configured to dynamically
respond to relative positioning as well as relative strength, using automated learning
techniques. In other embodiments, the present invention is adapted for use with
highly directional microphones, more than two sensors to simultaneously extract
multiple signals, and various adaptive amplification and filtering techniques known to
those skilled in the art.

The present invention greatly improves computational efficiency compared to
conventional systems by determining a spectral signal representative of the desired
signal as part of the localization processing. As a result, an output signal
characteristic of a desired signal from source 12 is determined as a function of the
signal pair XLp^I(m), XRp^I(m) corresponding to the separation of source 14 from
source 12. Also, the exponents in the denominator of CE1, CE2 correspond to phase
difference of frequencies f_m resulting from the separation of source 12 from 14.
Referring to the example of N=4 and assuming that I=1, this phase difference is
-2π(τ_1+τ_2)f_m (for delay line 42) and 2π(τ_3+τ_4)f_m (for delay line 44) and
corresponds to the separation of the representative location of off-axis source 14 from
the on-axis source 12 at i=3. Likewise the time increments, τ_1+τ_2 and τ_3+τ_4,
correspond to the separation of source 14 from source 12 for this example. Thus,
processor 30 implements dual delay line 40 and corresponding operational
relationships CE1, CE2 to provide a means for generating a desired signal by locating
the position of an interfering signal source relative to the source of the desired signal.
It is preferred that τ_i be selected to provide generally equal azimuthal
positions relative to reference axis R. In one embodiment, this arrangement
corresponds to the values of τ_i changing about 20% from the smallest to the largest
value. In other embodiments, τ_i are all generally equal to one another, simplifying
the operations of array 46. Notably, the pair of time increments in the denominator of
CE1, CE2 corresponding to the separation of the sources 12 and 14 become
approximately equal when all values τ_i are generally the same.

Processor 30 may be comprised of one or more components or pieces of
equipment. The processor may include digital circuits, analog circuits, or a
combination of these circuit types. Processor 30 may be programmable, an integrated
state machine, or utilize a combination of these techniques. Preferably, processor 30
is a solid state integrated digital signal processor circuit customized to perform the
process of the present invention with a minimum of external components and
connections. Similarly, the extraction process of the present invention may be
performed on variously arranged processing equipment configured to provide the
corresponding functionality with one or more hardware modules, firmware modules,
software modules, or a combination thereof. Moreover, as used herein, "signal"
includes, but is not limited to, software, firmware, hardware, programming variable,
communication channel, and memory location representations.

Referring to FIG. 4A, one application of the present invention is depicted as
hearing aid system 110. System 110 includes eyeglasses G with microphones 122 and
124 fixed to glasses G and displaced from one another. Microphones 122, 124 are
operatively coupled to hearing aid processor 130. Processor 130 is operatively
coupled to output device 190. Output device 190 is positioned in ear E to provide an
audio signal to the wearer.

Microphones 122, 124 are utilized in a manner similar to sensors 22, 24 of the
embodiment depicted by FIGS. 1-3. Similarly, processor 130 is configured with the
signal extraction process depicted in FIGS. 1-3. Processor 130 provides the
extracted signal to output device 190 to provide an audio output to the wearer. The
wearer of system 110 may position glasses G to align with a desired sound source,
such as a speech signal, to reduce interference from a nearby noise source off axis
from the midpoint between microphones 122, 124. Moreover, the wearer may select a
different signal by realigning with another desired sound source to reduce interference
from a noisy environment.

Processor 130 and output device 190 may be separate units (as depicted) or
included in a common unit worn in the ear. The coupling between processor 130 and
output device 190 may be an electrical cable or a wireless transmission. In one
alternative embodiment, sensors 122, 124 and processor 130 are remotely located and
are configured to broadcast to one or more output devices 190 situated in the ear E via
a radio frequency transmission or other conventional telecommunication method.

FIG. 4B shows a voice recognition system 210 employing the present invention as a
front end speech enhancement device. System 210 includes personal computer C with two
microphones 222, 224 spaced apart from each other in a predetermined relationship.
Microphones 222, 224 are operatively coupled to a processor 230 within computer C.
Processor 230 provides an output signal for internal use or responsive reply via speakers
294a, 294b or visual display 296. An operator aligns in a predetermined relationship with
microphones 222, 224 of computer C to deliver voice commands. Computer C is
configured to receive these voice commands, extracting the desired voice command from a
noisy environment in accordance with the process of FIGS. 1-3.

Referring to Figs. 10-13, signal processing system 310 of another embodiment of the
present invention is illustrated. Reference numerals of system 310 that are the same as
those of system 10 refer to like features. The signal flow diagram of FIG. 10 corresponds to
various signal processing techniques of system 310. Fig. 10 depicts left "L" and right "R"
input channels for signal processor 330 of system 310. Channels L, R each include an
acoustic sensor 22, 24 that provides an input signal x_Ln(t), x_Rn(t), respectively. Input
signals x_Ln(t) and x_Rn(t) correspond to composites of sounds from multiple acoustic
sources located within the detection range of sensors 22, 24. As described in connection
with FIG. 1 of system 10, it is preferred that sensors 22, 24 be standard microphones spaced
apart from each other at a predetermined distance D. In other embodiments a different
sensor type or arrangement may be employed as would occur to those skilled in the art.
Sensors 22, 24 are operatively coupled to processor 330 of system 310 to provide
input signals x_Ln(t) and x_Rn(t) to A/D converters 34a, 34b. A/D converters 34a, 34b of
processor 330 convert input signals x_Ln(t) and x_Rn(t) from an analog form to a discrete
form represented as x_Ln(k) and x_Rn(k), respectively; where "t" is the familiar continuous
time domain variable and "k" is the familiar discrete sample index variable. A
corresponding pair of preconditioning filters (not shown) may also be included in
processor 330 as described in connection with system 10.

Discrete Fourier Transform (DFT) stages 36a, 36b receive the digitized input signal
pair x_Ln(k) and x_Rn(k) from converters 34a, 34b, respectively. Stages 36a, 36b transform
input signals x_Ln(k) and x_Rn(k) into spectral signals designated X_Ln(m) and X_Rn(m)
using a short term discrete Fourier transform algorithm. Spectral signals X_Ln(m) and
X_Rn(m) are expressed in terms of a number of discrete frequency components indexed by
integer m; where m=1, 2, ..., M. Also, as used herein, the subscripts L and R denote the
left and right channels, respectively, and n indexes time frames for the discrete Fourier
transform analysis.

Delay operator 340 receives spectral signals X_Ln(m) and X_Rn(m) from stages 36a, 36b,
respectively. Delay operator 340 includes a number of dual delay lines (DDLs) 342 each
corresponding to a different one of the component frequencies indexed by m. Thus, there
are M different dual delay lines 342 utilized. However, only dual delay lines 342
corresponding to m=1 and m=M are shown in Fig. 10 to preserve clarity. The remaining
dual delay lines corresponding to m=2 through m=(M-1) are represented by an ellipsis.
Alternatively, delay operator 340 may be described as a single dual delay
line that simultaneously operates on M frequencies like dual delay line 40 of system 10.

The pair of frequency components from DFT stages 36a, 36b corresponding to a
given value of m are inputs into a corresponding one of dual delay lines 342. For the
examples illustrated in Fig. 10, spectral signal component pair X_Ln(m=1) and X_Rn(m=1) is
sent to the upper dual delay line 342 for the frequency corresponding to m=1; and spectral
signal component pair X_Ln(m=M) and X_Rn(m=M) is sent to the lower dual delay line 342 for
the frequency corresponding to m=M. Likewise, common frequency component pairs of
X_Ln(m) and X_Rn(m) for frequencies corresponding to m=2 through m=(M-1) are each sent to
a corresponding dual delay line as represented by ellipses to preserve clarity.

Referring additionally to Fig. 11, certain features of dual delay line 342 are further
illustrated. Each dual delay line 342 includes a left channel delay line 342a receiving a
corresponding frequency component input from DFT stage 36a and right channel delay line
342b receiving a corresponding frequency component input from DFT stage 36b. Delay
lines 342a, 342b each include an odd number I of delay stages 344 indexed by i = 1, 2, ..., I.
The I number of delayed signal pairs are provided on outputs 345 of delay stages 344 and
are correspondingly sent to complex multipliers 346. There is one multiplier 346
corresponding to each delay stage 344 for each delay line 342a, 342b. Multipliers 346
provide equalization weighting for the corresponding outputs of delay stages 344. Each
delayed signal pair from corresponding outputs 345 has one member from a delay stage 344
of left delay line 342a and the other member from a delay stage 344 of right delay line 342b.
Complex multipliers 346 of each dual delay line 342 output corresponding products of the I
number of delayed signal pairs along taps 347. The I number of signal pairs from taps 347
for each dual delay line 342 of operator 340 are input to signal operator 350.

For each dual delay line 342, the I number of pairs of multiplier taps 347 are each
input to a different Operation Array (OA) 352 of operator 350. Each pair of taps 347 is
provided to a different operation stage 354 within a corresponding operation array 352. In
Fig. 11, only a portion of delay stages 344, multipliers 346, and operation stages 354 are
shown corresponding to the two stages at either end of delay lines 342a, 342b and the
middle stages of delay lines 342a, 342b. The intervening stages follow the pattern of the
illustrated stages and are represented by ellipses to preserve clarity.

For an arbitrary frequency ω_m, delay times τ_i are given by equation (1) as follows:

\[ \tau_i = \frac{ITD_{max}}{2} \sin\left( \frac{i-1}{I-1}\pi - \frac{\pi}{2} \right), \quad i = 1, \ldots, I \tag{1} \]

where i is the integer delay stage index in the range (i = 1, ..., I); ITD_max = D/c is the
maximum Intermicrophone Time Difference; D is the distance between sensors 22, 24; and
c is the speed of sound. Further, delay times τ_i are antisymmetric with respect to the
midpoint of the delay stages corresponding to i=(I+1)/2 as indicated in the following
equation (2):

\[ \tau_{I-i+1} = \frac{ITD_{max}}{2} \sin\left( \frac{(I-i+1)-1}{I-1}\pi - \frac{\pi}{2} \right) = -\frac{ITD_{max}}{2} \sin\left( \frac{i-1}{I-1}\pi - \frac{\pi}{2} \right) = -\tau_i \tag{2} \]

The azimuthal plane may be uniformly divided into I sectors with the azimuth
position of each resulting sector being given by equation (3) as follows:

\[ \theta_i = \frac{i-1}{I-1} 180^\circ - 90^\circ, \quad i = 1, \ldots, I \tag{3} \]

The azimuth positions in auditory space may be mapped to corresponding delayed
signal pairs along each dual delay line 342 in accordance with equation (4) as follows:

\[ \tau_i = \frac{ITD_{max}}{2} \sin\theta_i, \quad i = 1, \ldots, I \tag{4} \]
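
A brief Python sketch of equations (3) and (4) follows. The sensor spacing of 0.15 m and
the speed of sound of 343 m/s are assumed values chosen only for illustration, and the
function names are hypothetical.

```python
import math


def azimuth_deg(i, n_taps):
    """Equation (3): azimuth of tap i in degrees, for i = 1..I."""
    return (i - 1) / (n_taps - 1) * 180.0 - 90.0


def delay_time(i, n_taps, spacing_m, speed_of_sound=343.0):
    """Equation (4): tau_i = (ITD_max / 2) * sin(theta_i), ITD_max = D / c."""
    itd_max = spacing_m / speed_of_sound
    return 0.5 * itd_max * math.sin(math.radians(azimuth_deg(i, n_taps)))


if __name__ == "__main__":
    n_taps = 9                      # an odd number of taps, as in the text
    for i in range(1, n_taps + 1):
        tau_us = 1e6 * delay_time(i, n_taps, 0.15)
        print(f"i={i}  theta={azimuth_deg(i, n_taps):6.1f} deg  tau={tau_us:8.2f} us")
```

The printed delays change sign about the middle tap i = (I+1)/2, which is the
antisymmetry stated in equation (2), and the difference between the last and first delay
equals ITD_max.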



The dual delay-line structure is similar to the embodiment of system 10,
except that a different dual delay line is represented for each value of m and
multipliers 346 have been included to multiply each corresponding delay stage 344 by
an appropriate one of equalization factors α_i(m); where i is the delay stage index
previously described. Preferably, elements α_i(m) are selected to compensate for
differences in the noise intensity at sensors 22, 24 as a function of both azimuth and
frequency.

One preferred embodiment for determining equalization factors α_i(m) assumes
amplitude compensation is independent of frequency, regarding any departure from
this model as being negligible. For this embodiment, the amplitude of the received
sound pressure |p| varies with the source-receiver distance r in accordance with
equations (A1) and (A2) as follows:

\[ |p| \propto \frac{1}{r} , \tag{A1} \]

\[ \frac{|p_L|}{|p_R|} = \frac{r_R}{r_L} , \tag{A2} \]

where |p_L| and |p_R| are the amplitude of sound pressures at sensors 22, 24. Fig. 12
depicts sensors 22, 24 and a representative acoustic source S1 within the range of
reception to provide input signals x_Ln(t) and x_Rn(t). According to the geometry
illustrated in Fig. 12, the distances r_L and r_R from the source S1 to the left and right
sensors, respectively, are given by equations (A3) and (A4), as follows:

\[ r_L = \sqrt{(l \sin\theta_i + D/2)^2 + (l \cos\theta_i)^2} = \sqrt{l^2 + l D \sin\theta_i + D^2/4} , \tag{A3} \]

\[ r_R = \sqrt{(l \sin\theta_i - D/2)^2 + (l \cos\theta_i)^2} = \sqrt{l^2 - l D \sin\theta_i + D^2/4} . \tag{A4} \]

For a given delayed signal pair in the dual delay-line 342 of FIG. 11 to
become equalized under this approach, the factors α_i(m) and α_(I-i+1)(m) must satisfy
equation (A5) as follows:

\[ |p_L| \, \alpha_i(m) = |p_R| \, \alpha_{I-i+1}(m) . \tag{A5} \]

Substituting equation (A2) into equation (A5), equation (A6) results as follows:

\[ \frac{r_L}{r_R} = \frac{\alpha_i(m)}{\alpha_{I-i+1}(m)} . \tag{A6} \]

By defining the value of α_i(m) in accordance with equation (A7) as follows:

\[ \alpha_i(m) = K \sqrt{l^2 + l D \sin\theta_i + D^2/4} , \tag{A7} \]

where K is in units of inverse length and is chosen to provide a convenient amplitude
level, the value of α_(I-i+1)(m) is given by equation (A8) as follows:

\[ \alpha_{I-i+1}(m) = K \sqrt{l^2 + l D \sin\theta_{I-i+1} + D^2/4} = K \sqrt{l^2 - l D \sin\theta_i + D^2/4} , \tag{A8} \]

where the relation sin θ_(I-i+1) = -sin θ_i can be obtained by substituting I-i+1 into i in
equation (3). By substituting equations (A7) and (A8) into equation (A6), it may be
verified that the values assigned to α_i(m) in equation (A7) satisfy the condition
established by equation (A6).
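
The verification described above can also be carried out numerically. The Python sketch
below evaluates equations (A3), (A4) and (A7) and compares both sides of equation (A6) for
every tap. The parameter names and values (source distance of 2 m, spacing of 0.15 m,
K = 1) are assumptions made only for this check.

```python
import math


def sensor_distances(source_dist, theta_deg, spacing):
    """Equations (A3) and (A4): distances r_L and r_R from the source to the
    left and right sensors; source_dist plays the role of l."""
    s = math.sin(math.radians(theta_deg))
    base = source_dist ** 2 + spacing ** 2 / 4.0
    r_left = math.sqrt(base + source_dist * spacing * s)
    r_right = math.sqrt(base - source_dist * spacing * s)
    return r_left, r_right


def equalization_factor(i, n_taps, source_dist, spacing, k=1.0):
    """Equation (A7): alpha_i = K * sqrt(l^2 + l*D*sin(theta_i) + D^2/4)."""
    theta = (i - 1) / (n_taps - 1) * 180.0 - 90.0      # equation (3)
    r_left, _ = sensor_distances(source_dist, theta, spacing)
    return k * r_left


if __name__ == "__main__":
    n_taps, dist, d = 9, 2.0, 0.15
    for i in range(1, n_taps + 1):
        theta = (i - 1) / (n_taps - 1) * 180.0 - 90.0
        r_l, r_r = sensor_distances(dist, theta, d)
        lhs = r_l / r_r
        rhs = (equalization_factor(i, n_taps, dist, d)
               / equalization_factor(n_taps - i + 1, n_taps, dist, d))
        print(i, round(lhs, 6), round(rhs, 6))
```

In this frequency-independent model the factor depends only on the tap index i, so the
argument m is omitted from the sketch.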

After obtaining the equalization factors α_i(m) in accordance with this
embodiment, minor adjustments are preferably made to calibrate for asymmetries in
the sensor arrangement and other departures from the ideal case such as those that
might result from media absorption of acoustic energy, an acoustic source geometry
other than a point source, and dependence of amplitude decline on parameters other
than distance.

After equalization by factors α_i(m) with multipliers 346, the in-phase desired
signal component is generally the same in the left and right channels of the dual delay
lines 342 for the delayed signal pairs corresponding to i = i_signal = s, and the in-phase
noise signal component is generally the same in the left and right channels of the dual
delay lines 342 for the delayed signal pairs corresponding to i = i_noise = g for the case
of a single, predominant interfering noise source. The desired signal at i=s may be
expressed as S_n(m) = A_s exp[j(ω_m t + φ_s)]; and the interfering signal at i=g may be
expressed as G_n(m) = A_g exp[j(ω_m t + φ_g)], where φ_s and φ_g denote initial phases. Based
on these models, equalized signals α_i(m)X_Ln^(i)(m) for the left channel and
α_(I-i+1)(m)X_Rn^(i)(m) for the right channel at any arbitrary point i (except i = s) along
dual delay lines 342 may be expressed in equations (5) and (6) as follows:

\[ \alpha_i(m) X_{Ln}^{(i)}(m) = A_s \exp j[\omega_m (t + \tau_s - \tau_i) + \phi_s] + A_g \exp j[\omega_m (t + \tau_g - \tau_i) + \phi_g] , \tag{5} \]

\[ \alpha_{I-i+1}(m) X_{Rn}^{(i)}(m) = A_s \exp j[\omega_m (t + \tau_{I-s+1} - \tau_{I-i+1}) + \phi_s] + A_g \exp j[\omega_m (t + \tau_{I-g+1} - \tau_{I-i+1}) + \phi_g] , \tag{6} \]

where equations (7) and (8) further define certain terms of equations (5) and (6) as
follows:

\[ X_{Ln}^{(i)}(m) = X_{Ln}(m) \exp(-j 2\pi f_m \tau_i) , \tag{7} \]

\[ X_{Rn}^{(i)}(m) = X_{Rn}(m) \exp(-j 2\pi f_m \tau_{I-i+1}) . \tag{8} \]
Each signal pair α_i(m)X_Ln^(i)(m) and α_(I-i+1)(m)X_Rn^(i)(m) is input to a corresponding
operation stage 354 of a corresponding one of operation arrays 352 for all m; where
each operation array 352 corresponds to a different value of m as in the case of dual
delay lines 342. For a given operation array 352, operation stages 354 corresponding
to each value of i, except i=s, perform the operation defined by equation (9) as
follows:

\[ X_n^{(i)}(m) = \frac{\alpha_i(m) X_{Ln}^{(i)}(m) - \alpha_{I-i+1}(m) X_{Rn}^{(i)}(m)}{(\alpha_i/\alpha_s) \exp[j\omega_m(\tau_s - \tau_i)] - (\alpha_{I-i+1}/\alpha_{I-s+1}) \exp[j\omega_m(\tau_{I-s+1} - \tau_{I-i+1})]} , \quad \text{for } i \neq s . \tag{9} \]

If the value of the denominator in equation (9) is too small, a small positive constant ε
is added to the denominator to limit the magnitude of the output signal X_n^(i)(m). No
operation is performed by the operation stage 354 on the signal pair corresponding to
i=s for all m (all operation arrays 352 of signal operator 350).

Equation (9) is comparable to the expressions CE1 and CE2 of system 10;
however, equation (9) includes equalization elements α_i(m) and is organized into a
single expression. With the outputs from operation array 352, the simultaneous
localization and identification of the spectral content of the desired signal may be
performed with system 310. Localization and extraction with system 310 are further
described by the signal flow diagram of Fig. 13 and the following mathematical
model. By substituting equations (5) and (6) into equation (9), equation (10) results
as follows:

\[ X_n^{(i)}(m) = S_n(m) + G_n(m) \cdot v_{s,g}^{(i)}(m) , \quad i \neq s , \tag{10} \]

where equation (11) further defines:

\[ v_{s,g}^{(i)}(m) = \frac{(\alpha_i/\alpha_g) \exp[j\omega_m(\tau_g - \tau_i)] - (\alpha_{I-i+1}/\alpha_{I-g+1}) \exp[j\omega_m(\tau_{I-g+1} - \tau_{I-i+1})]}{(\alpha_i/\alpha_s) \exp[j\omega_m(\tau_s - \tau_i)] - (\alpha_{I-i+1}/\alpha_{I-s+1}) \exp[j\omega_m(\tau_{I-s+1} - \tau_{I-i+1})]} , \quad i \neq s . \tag{11} \]

By applying equation (2) to equation (11), equation (12) results as follows:

\[ v_{s,g}^{(i)}(m) = \frac{(\alpha_i/\alpha_g) \exp[j\omega_m(\tau_g - \tau_i)] - (\alpha_{I-i+1}/\alpha_{I-g+1}) \exp[-j\omega_m(\tau_g - \tau_i)]}{(\alpha_i/\alpha_s) \exp[j\omega_m(\tau_s - \tau_i)] - (\alpha_{I-i+1}/\alpha_{I-s+1}) \exp[-j\omega_m(\tau_s - \tau_i)]} , \quad i \neq s . \tag{12} \]



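The energy of the signal X_n^(i)(m) is expressed in equation (13) as follows:

\[ |X_n^{(i)}(m)|^2 = |S_n(m) + G_n(m) \cdot v_{s,g}^{(i)}(m)|^2 . \tag{13} \]

A signal vector may be defined:

\[ x^{(i)} = \left( X_1^{(i)}(1), X_1^{(i)}(2), \ldots, X_1^{(i)}(M), X_2^{(i)}(1), \ldots, X_2^{(i)}(M), \ldots, X_N^{(i)}(1), \ldots, X_N^{(i)}(M) \right)^T , \quad i = 1, \ldots, I , \]

where T denotes transposition. The energy ||x^(i)||_2^2 of the vector x^(i) is given by
equation (14) as follows:

\[ \|x^{(i)}\|_2^2 = \sum_{n=1}^{N} \sum_{m=1}^{M} |X_n^{(i)}(m)|^2 = \sum_{n=1}^{N} \sum_{m=1}^{M} |S_n(m) + G_n(m) \cdot v_{s,g}^{(i)}(m)|^2 , \quad i = 1, \ldots, I . \tag{14} \]

Equation (14) is a double summation over time and frequency that approximates a
double integration in a continuous time domain representation.
Further defining the following vectors:

\[ s = \left( S_1(1), S_1(2), \ldots, S_1(M), S_2(1), \ldots, S_2(M), \ldots, S_N(1), \ldots, S_N(M) \right)^T , \text{ and} \]

\[ g^{(i)} = \left( G_1(1) v_{s,g}^{(i)}(1), G_1(2) v_{s,g}^{(i)}(2), \ldots, G_1(M) v_{s,g}^{(i)}(M), \ldots, G_N(1) v_{s,g}^{(i)}(1), \ldots, G_N(M) v_{s,g}^{(i)}(M) \right)^T , \text{ where } i = 1, \ldots, I , \]

the energy of vectors s and g^(i) are respectively defined by equations (15) and (16) as
follows:

\[ \|s\|_2^2 = \sum_{n=1}^{N} \sum_{m=1}^{M} |S_n(m)|^2 , \tag{15} \]

\[ \|g^{(i)}\|_2^2 = \sum_{n=1}^{N} \sum_{m=1}^{M} |G_n(m) \cdot v_{s,g}^{(i)}(m)|^2 , \quad i = 1, \ldots, I . \tag{16} \]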



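For a desired signal that is independent of the interfering source, the vectors $\mathbf{s}$
and $\mathbf{g}^{(i)}$ are orthogonal. In accordance with the Theorem of Pythagoras, equation (17)
results as follows:

$$\|\mathbf{x}^{(i)}\|_2^2 = \|\mathbf{s}\|_2^2 + \|\mathbf{g}^{(i)}\|_2^2. \qquad (17)$$

Because $\|\mathbf{g}^{(i)}\|_2^2 \geq 0$, equation (18) results as follows:

$$\|\mathbf{x}^{(i)}\|_2^2 \geq \|\mathbf{s}\|_2^2. \qquad (18)$$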



The equality in equation (18) is satisfied only when $\|\mathbf{g}^{(i)}\|_2^2 = 0$, which happens if
either of the following two conditions is met: (a) $G_n(m) = 0$, i.e., the noise source is
silent, in which case there is no need for doing localization of the noise source and
noise cancellation; and (b) $v_{s,g}^{(i)}(m) = 0$, where equation (12) indicates that this second
condition arises for $i = g = i_{noise}$. Therefore, $\|\mathbf{x}^{(i)}\|_2^2$ has its minimum at $i = g = i_{noise}$,
which according to equation (18) is $\|\mathbf{s}\|_2^2$. Equation (19) further describes this
condition as follows:

$$\|\mathbf{s}\|_2^2 = \|\mathbf{x}^{(i_{noise})}\|_2^2 = \min_i \|\mathbf{x}^{(i)}\|_2^2. \qquad (19)$$
Thus, the localization procedure includes finding the position $i_{noise}$ along the
operation array 352 for each of the delay lines 342 that produces the minimum value
of $\|\mathbf{x}^{(i)}\|_2^2$. Once the location $i_{noise}$ along the dual delay line 342 is determined, the
azimuth position of the noise source may be determined with equation (3). The
estimated noise location $i_{noise}$ may be utilized for noise cancellation or extraction of
the desired signal as further described hereinafter. Indeed, operation stages 354 for all
m corresponding to $i = i_{noise}$ provide the spectral components of the desired signal as
given by equation (20):

$$\hat{S}_n(m) = X_n^{(i_{noise})}(m) = S_n(m) + G_n(m)\, v_{s,g}^{(i_{noise})}(m) = S_n(m). \qquad (20)$$

Localization operator 360 embodies the localization technique of system 310.
Fig. 13 further depicts operator 360 with coupled pairs of summation operators 362
and 364 for each value of integer index i; where i=1,...,I. Collectively, summation
operators 362 and 364 perform the operation corresponding to equation (14) to
generate $\|\mathbf{x}^{(i)}\|_2^2$ for each value of i. For each transform time frame n, the summation
operators 362 each receive $X_n^{(i)}(1)$ through $X_n^{(i)}(M)$ inputs from operation stages 354
corresponding to their value of i and sum over frequencies m=1 through m=M. For
the illustrated example, the upper summation operator 362 corresponds to i=1 and
receives signals $X_n^{(1)}(1)$ through $X_n^{(1)}(M)$ for summation; and the lower summation
operator 362 corresponds to i=I and receives signals $X_n^{(I)}(1)$ through $X_n^{(I)}(M)$ for
summation.

Each summation operator 364 receives the results for each transform time
frame n from the summation operator 362 corresponding to the same value of i and
accumulates a sum of the results over time corresponding to n=1 through n=N
transform time frames; where N is a quantity of time frames empirically determined
to be suitable for localization. For the illustrated example, the upper summation
operator 364 corresponds to i=1 and sums the results from the upper summation
operator 362 over N samples; and the lower summation operator 364 corresponds to
i=I and sums the results from the lower summation operator 362 over N samples.

The I number of values of $\|\mathbf{x}^{(i)}\|_2^2$ resulting from the I number of summation
operators 364 are received by stage 366. Stage 366 compares the I number of $\|\mathbf{x}^{(i)}\|_2^2$
values to determine the value of i corresponding to the minimum $\|\mathbf{x}^{(i)}\|_2^2$. This value
of i is output by stage 366 as $i = g = i_{noise}$.
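
By way of nonlimiting illustration, the accumulation performed by summation operators 362 and 364 and the comparison performed by stage 366 may be sketched in Python as follows. The function name, the nested-list layout `x[i][n][m]`, the 0-based indexing, and the toy data are assumptions made for this sketch only; they are not part of the embodiments described above.

```python
from typing import List, Sequence


def localize_min_energy(x: Sequence[Sequence[Sequence[complex]]]) -> int:
    """Return the 0-based delay-line index i that minimizes ||x^(i)||^2.

    x[i][n][m] is a hypothetical container for the operation-stage output
    X_n^(i)(m) at position i, time frame n, and frequency bin m.
    """
    energies: List[float] = []
    for frames in x:                       # one entry per position i
        total = 0.0
        for spectrum in frames:            # accumulate over frames n (operator 364)
            # sum of squared magnitudes over bins m (operator 362)
            total += sum(abs(value) ** 2 for value in spectrum)
        energies.append(total)
    # comparison stage (stage 366): position with the smallest energy
    return min(range(len(energies)), key=energies.__getitem__)


# Toy data: 3 positions, 2 frames, 2 bins. Position 1 carries the least energy.
demo = [
    [[1 + 1j, 2 + 0j], [0 + 1j, 1 + 0j]],
    [[0.5 + 0j, 0 + 0.5j], [0.1 + 0j, 0 + 0j]],
    [[2 + 0j, 2 + 0j], [1 + 1j, 0 + 0j]],
]
i_noise = localize_min_energy(demo)
```

In this sketch the returned index plays the role of $i_{noise}$ in equation (19); a practical implementation would typically update the running sums frame by frame rather than store all N frames.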
Referring back to Fig. 10, post-localization processing by system 310 is
further described. When equation (9) is applied to the pair inputs of delay lines 342 at
i=g, it corresponds to the position of the off-axis noise source and equation (20)
shows it provides an approximation of the desired signal $S_n(m)$. To extract signal
$S_n(m)$, the index value i=g is sent by stage 366 of localization operator 360 to extraction
operator 380. In response to g, extraction operator 380 routes the outputs $X_n^{(g)}(1)$
through $X_n^{(g)}(M)$, which approximate $S_n(m)$, to Inverse Fourier Transform (IFT) stage 82
operatively coupled thereto. For this purpose, extraction operator 380 preferably includes a
multiplexer or matrix switch that has I×M complex inputs and M complex outputs;
where a different set of M inputs is routed to the outputs for each different value of
the index i in response to the output from stage 366 of localization operator 360.

Stage 82 converts the M spectral components received from extraction operator
380 to transform the spectral approximation of the desired signal, $S_n(m)$, from the
frequency domain to the time domain as represented by signal $s_n(k)$. Stage 82 is
operatively coupled to digital-to-analog (D/A) converter 84. D/A converter 84
receives signal $s_n(k)$ for conversion from a discrete form to an analog form
represented by $s_n(t)$. Signal $s_n(t)$ is input to output device 90 to provide an auditory
representation of the desired signal or other indicia as would occur to those skilled in
the art. Stage 82, converter 84, and device 90 are further described in connection with
system 10.
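
As a nonlimiting sketch of the routing and inverse transform just described, the following Python fragment selects the M spectral components at index g and converts them to the time domain with a direct inverse DFT. The function name, the list layout `x_frame[i][m]`, and the sample data are hypothetical; a practical stage 82 would more likely use a fast transform.

```python
import cmath
from typing import List, Sequence


def extract_time_signal(x_frame: Sequence[Sequence[complex]], g: int) -> List[complex]:
    """Route the M bins at delay-line position g and inverse-transform them.

    x_frame[i][m] is a hypothetical container for X_n^(i)(m) in one time frame.
    """
    spectrum = x_frame[g]          # multiplexer: select the M inputs for index g
    size = len(spectrum)
    samples = []
    for k in range(size):          # direct inverse DFT, O(M^2), for clarity only
        acc = sum(spectrum[m] * cmath.exp(2j * cmath.pi * m * k / size)
                  for m in range(size))
        samples.append(acc / size)
    return samples


# Position 1 holds a spectrum with only a DC component, so every
# time-domain sample equals the same constant.
frame = [[0j, 1 + 0j, 0j, 0j], [4 + 0j, 0j, 0j, 0j]]
signal = extract_time_signal(frame, g=1)
```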

Another form of expression of equation (9) is given by equation (21) as
follows:

$$X_n^{(i)}(m) = w_{Ln}(m)\, X_{Ln}^{(i)}(m) + w_{Rn}(m)\, X_{Rn}^{(i)}(m). \qquad (21)$$

The terms $w_{Ln}$ and $w_{Rn}$ are equivalent to beamforming weights for the left and right
channels, respectively. As a result, the operation of equation (9) may be equivalently
modeled as a beamforming procedure that places a null at the location corresponding
to the predominant noise source, while steering to the desired output signal $s_n(t)$.
Fig. 14 depicts system 410 of still another embodiment of the present
invention. System 410 is depicted with several reference numerals that are the same
as those used in connection with systems 10 and 310 and are intended to designate
like features. A number of acoustic sources 412, 414, 416, 418 are depicted in Fig. 14
within the reception range of acoustic sensors 22, 24 of system 410. The positions of
sources 412, 414, 416, 418 are also represented by the azimuth angles relative to axis
AZ that are designated with reference numerals 412a, 414a, 416a, 418a. As depicted,
angles 412a, 414a, 416a, 418a correspond to about 0°, +20°, +75°, and -75°,
respectively. Sensors 22, 24 are operatively coupled to signal processor 430 with axis
AZ extending about midway therebetween. Processor 430 receives input signals
$x_{Ln}(t)$, $x_{Rn}(t)$ from sensors 22, 24 corresponding to left channel L and right channel R
as described in connection with system 310. Processor 430 processes signals $x_{Ln}(t)$,
$x_{Rn}(t)$ and provides corresponding output signals to output devices 90, 490
operatively coupled thereto.

Referring additionally to the signal flow diagram of Fig. 15, selected features
of system 410 are further illustrated. System 410 includes A/D converters 34a, 34b
and DFT stages 36a, 36b to provide the same left and right channel processing as
described in connection with system 310. System 410 includes delay operator 340
and signal operator 350 as described for system 310; however, it is preferred that
equalization factors $\alpha_i(m)$ (i = 1, ..., I) be set to unity for the localization processes
associated with localization operator 460 of system 410. Furthermore, unlike system 310,
localization operator 460 of system 410 directly receives the output signals of delay
operator 340 instead of the output signals of signal operator 350.

The localization technique embodied in operator 460 begins by establishing
two-dimensional (2-D) plots of coincidence loci in terms of frequency versus azimuth
position. The coincidence points of each locus represent a minimum difference
between the left and right channels for each frequency as indexed by m. This
minimum difference may be expressed as the minimum magnitude difference
$\delta X_n^{(i)}(m)$ between the frequency domain representations $X_{Ln}^{(i)}(m)$ and $X_{Rn}^{(i)}(m)$, at each
discrete frequency m, yielding M/2 potentially different loci. If the acoustic sources
are spatially coherent, then these loci will be the same across all frequencies. This
operation is described in equations (22)-(25) as follows:

$$i_n(m) = \arg\min_i \{\delta X_n^{(i)}(m)\}, \quad m = 1, \ldots, M/2, \qquad (22)$$

$$\delta X_n^{(i)}(m) = |X_{Ln}^{(i)}(m) - X_{Rn}^{(i)}(m)|, \quad i = 1, \ldots, I;\ m = 1, \ldots, M/2, \qquad (23)$$

$$X_{Ln}^{(i)}(m) = X_{Ln}(m) \exp(-j 2\pi \tau_i m / M), \quad i = 1, \ldots, I;\ m = 1, \ldots, M/2, \qquad (24)$$

$$X_{Rn}^{(i)}(m) = X_{Rn}(m) \exp(-j 2\pi \tau_{I-i+1} m / M), \quad i = 1, \ldots, I;\ m = 1, \ldots, M/2. \qquad (25)$$
If the amplitudes of the left and right channels are generally the same at a
given position along dual delay lines 342 of system 410 as indexed by i, then the
value of $\delta X_n^{(i)}(m)$ for the corresponding value of i is minimized, if not essentially
zero. It is noted that, despite inter-sensor intensity differences, equalization factors
$\alpha_i(m)$ (i = 1, ..., I) should be maintained close to unity for the purpose of coincidence
detection; otherwise, the minimal $\delta X_n^{(i)}(m)$ will not correspond to the in-phase
(coincidence) locations.
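
The magnitude-difference search of equations (22)-(25) may be illustrated, without limitation, by the following Python sketch for a single frequency bin. The function name, argument names, 0-based indexing, and the five-tap delay line are assumptions for illustration; the delays are expressed in samples so that the exponent takes the form shown in equations (24) and (25).

```python
import cmath
from typing import Sequence


def coincidence_index(x_left: complex, x_right: complex,
                      delays: Sequence[float], m: int, size: int) -> int:
    """0-based index i minimizing |X_L^(i)(m) - X_R^(i)(m)| for one bin m."""
    count = len(delays)
    best_i, best_diff = 0, float("inf")
    for i in range(count):
        # Equation (24): left channel delayed by tau_i.
        left = x_left * cmath.exp(-2j * cmath.pi * delays[i] * m / size)
        # Equation (25): right channel delayed by tau_{I-i+1} (0-based: count-1-i).
        right = x_right * cmath.exp(-2j * cmath.pi * delays[count - 1 - i] * m / size)
        diff = abs(left - right)   # equation (23)
        if diff < best_diff:       # equation (22): keep the minimum
            best_i, best_diff = i, diff
    return best_i


# Hypothetical symmetric delay line (in samples) and a source whose right
# channel lags the left channel by 2 samples.
delays = [-2.0, -1.0, 0.0, 1.0, 2.0]
size, m = 64, 3
x_left = 1 + 0j
x_right = x_left * cmath.exp(-2j * cmath.pi * 2.0 * m / size)
index = coincidence_index(x_left, x_right, delays, m, size)
```

With these toy values the two delayed channels coincide where the left delay is +1 sample and the right delay is -1 sample, i.e. at 0-based index 3.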

An alternative approach may be based on identifying coincidence loci from
the phase difference. For this phase difference approach, the minima of the phase
difference between the left and right channel signals at positions along the dual delay
lines 342, as indexed by i, are located as described by the following equations (26)
and (27):

$$i_n(m) = \arg\min_i \{\delta X_n^{(i)}(m)\}, \quad m = 1, \ldots, M/2, \qquad (26)$$

$$\delta X_n^{(i)}(m) = \left|\mathrm{Im}\!\left[X_{Ln}^{(i)}(m)\, X_{Rn}^{(i)*}(m)\right]\right|, \quad i = 1, \ldots, I;\ m = 1, \ldots, M/2, \qquad (27)$$

where Im[·] denotes the imaginary part of the argument and the superscript *
denotes a complex conjugate. Since the phase difference technique detects the
minimum angle between two complex vectors, there is also no need to compensate for
the inter-sensor intensity difference.
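
For comparison, a nonlimiting Python sketch of the phase-difference criterion of equations (26) and (27) follows. The names and the toy delay line are again hypothetical. The right channel is deliberately attenuated by one half to show that the minimum is located without amplitude equalization.

```python
import cmath
from typing import Sequence


def phase_coincidence_index(x_left: complex, x_right: complex,
                            delays: Sequence[float], m: int, size: int) -> int:
    """0-based index i minimizing |Im[X_L^(i)(m) * conj(X_R^(i)(m))]|."""
    count = len(delays)
    best_i, best_diff = 0, float("inf")
    for i in range(count):
        left = x_left * cmath.exp(-2j * cmath.pi * delays[i] * m / size)
        right = x_right * cmath.exp(-2j * cmath.pi * delays[count - 1 - i] * m / size)
        diff = abs((left * right.conjugate()).imag)   # equation (27)
        if diff < best_diff:                          # equation (26)
            best_i, best_diff = i, diff
    return best_i


delays = [-2.0, -1.0, 0.0, 1.0, 2.0]   # hypothetical delays in samples
size, m = 64, 3
x_left = 1 + 0j
# Right channel lags by 2 samples and is half as strong as the left channel.
x_right = 0.5 * x_left * cmath.exp(-2j * cmath.pi * 2.0 * m / size)
phase_index = phase_coincidence_index(x_left, x_right, delays, m, size)
```

One caveat of this sketch is that the imaginary part also vanishes when the two delayed channels are exactly in anti-phase, so an implementation may need an additional sign check on the real part; that check is not described in the text above and is mentioned only as a design consideration.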

While either the magnitude or phase difference approach may be effective
without further processing to localize a single source, multiple sources often emit
spectrally overlapping signals that lead to coincidence loci which correspond to
nonexistent or phantom sources (e.g., at the midpoint between two equally intense
sources at the same frequency). Fig. 17 illustrates a 2-D coincidence plot 500 in
terms of frequency in Hertz (Hz) along the vertical axis and azimuth position in
degrees along the horizontal axis. Plot 500 indicates two sources corresponding to the
generally vertically aligned locus 512a at about -20 degrees and the vertically
aligned locus 512b at about +40 degrees. Plot 500 also includes misidentified or
phantom source points 514a, 514b, 514c, 514d, 514e at other azimuth positions that
correspond to frequencies where both sources have significant energy. Plots having
more than two competing acoustic sources generally result in an
even more complex plot.
To reduce the occurrence of phantom information in the 2-D coincidence plot
data, localization operator 460 integrates over time and frequency. When the signals
are not correlated at each frequency, the mutual interference between the signals can
be gradually attenuated by the temporal integration. This approach averages the
locations of the coincidences, not the value of the function used to determine the
minima, which is equivalent to applying a Kronecker delta function, $\delta(i - i_n(m))$, to
$\delta X_n^{(i)}(m)$ and averaging the $\delta(i - i_n(m))$ over time. In turn, the coincidence loci
corresponding to the true position of the sources are enhanced. Integration over time
applies a forgetting average to the 2-D coincidence plots acquired over a
predetermined set of transform time frames from n = 1, ..., N; and is expressed by the
summation approximation of equation (28) as follows:

$$P_N(\theta_i, m) = \sum_{n=1}^{N} \beta^{N-n}\, \delta(i - i_n(m)), \quad i = 1, \ldots, I;\ m = 1, \ldots, M/2, \qquad (28)$$

where $0 < \beta < 1$ is a weighting coefficient which exponentially de-emphasizes (or
forgets) the effect of previous coincidence results, $\delta(\cdot)$ is the Kronecker delta
function, $\theta_i$ represents the position along the dual delay-lines 342 corresponding to
spatial azimuth $\theta_i$ [equation (2)], and N refers to the current time frame. To reduce
the cluttering effect due to instantaneous interactions of the acoustic sources, the
results of equation (28) are tested in accordance with the relationship defined by
equation (29) as follows:



$$P_N(\theta_i, m) = \begin{cases} P_N(\theta_i, m), & P_N(\theta_i, m) \geq \Gamma, \\ 0, & \text{otherwise,} \end{cases} \qquad (29)$$

where $\Gamma \geq 0$ is an empirically determined threshold. While this approach assumes that
inter-sensor delays are independent of frequency, it has been found that departures
from this assumption may generally be considered negligible.
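
The forgetting average of equation (28) and the threshold test of equation (29) may be sketched, as a nonlimiting example, in Python for one frequency bin. The function names, the list of per-frame coincidence indices, and the values of β and Γ below are assumptions chosen only so that the arithmetic is easy to follow.

```python
from typing import List, Sequence


def accumulate_coincidences(index_history: Sequence[int],
                            num_positions: int, beta: float) -> List[float]:
    """P_N(theta_i, m) for one bin m: sum over n of beta^(N-n) * delta(i - i_n(m)).

    index_history[n-1] is the 0-based coincidence index i_n(m) found in frame n.
    """
    total_frames = len(index_history)
    pattern = [0.0] * num_positions
    for n, i_n in enumerate(index_history, start=1):
        # The Kronecker delta selects a single position per frame;
        # older frames are weighted down exponentially.
        pattern[i_n] += beta ** (total_frames - n)
    return pattern


def apply_threshold(pattern: Sequence[float], gamma: float) -> List[float]:
    """Equation (29): keep values at or above the threshold, zero the rest."""
    return [value if value >= gamma else 0.0 for value in pattern]


# Four frames at one frequency: position 2 is detected three times and
# position 0 once (for example, a momentary phantom coincidence).
history = [2, 2, 0, 2]
pattern = accumulate_coincidences(history, num_positions=3, beta=0.5)
cleaned = apply_threshold(pattern, gamma=1.0)
```

Here the isolated detection at position 0 accumulates a weight of 0.5 and is removed by the threshold, while position 2 accumulates 0.125 + 0.25 + 1.0 = 1.375 and survives. In a streaming implementation the same sum can be kept recursively by multiplying the stored pattern by β before adding each new frame.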

By integrating the coincidence plots across frequency, a more robust and
reliable indication of the locations of sources in space is obtained. Integration of
$P_N(\theta_i, m)$ over frequency produces a localization pattern which is a function of
azimuth. Two techniques to estimate the true position of the acoustic sources may be
utilized. The first estimation technique is solely based on the straight vertical traces
across frequency that correspond to different azimuths. For this technique, $\theta_d$
denotes the azimuth with which the integration is associated, such that $\theta_d = \theta_i$, and
results in the summation over frequency of equation (30) as follows:

$$H_N(\theta_d) = \sum_{m} P_N(\theta_d, m), \quad d = 1, \ldots, I, \qquad (30)$$

where equation (30) approximates integration over frequency.

The peaks in $H_N(\theta_d)$ represent the source azimuth positions. If there are Q
sources, Q peaks in $H_N(\theta_d)$ may generally be expected. When compared with the
patterns $\delta(i - i_n(m))$ at each frequency, not only is the accuracy of localization enhanced
when more than one sound source is present, but also almost immediate localization
of multiple sources for the current frame is possible. Furthermore, although a
dominant source usually has a higher peak in $H_N(\theta_d)$ than do weaker sources, the
height of a peak in $H_N(\theta_d)$ only indirectly reflects the energy of the sound source.
Rather, the height is influenced by several factors such as the energy of the signal
component corresponding to $\theta_d$ relative to the energy of the other signal components
for each frequency band, the number of frequency bands, and the duration over which
the signal is dominant. In fact, each frequency is weighted equally in equation (28).
As a result, masking of weaker sources by a dominant source is reduced. In contrast,
existing time-domain cross-correlation methods incorporate the signal intensity, more
heavily biasing sensitivity to the dominant source.
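
A nonlimiting Python sketch of the first estimation technique follows: the coincidence pattern is summed over frequency per equation (30) and local maxima are reported as candidate source positions. The layout `pattern[m][i]`, the function names, and in particular the simple neighbor-comparison rule used to pick peaks are assumptions of this sketch; the text above does not prescribe a specific peak-picking rule.

```python
from typing import List, Sequence


def integrate_over_frequency(pattern: Sequence[Sequence[float]]) -> List[float]:
    """Equation (30): H_N(theta_d) = sum over m of P_N(theta_d, m).

    pattern[m][i] is a hypothetical container for P_N(theta_i, m).
    """
    num_positions = len(pattern[0])
    return [sum(row[d] for row in pattern) for d in range(num_positions)]


def find_peaks(h: Sequence[float]) -> List[int]:
    """Interior positions that exceed the left neighbor and are not below the right."""
    return [d for d in range(1, len(h) - 1) if h[d] > h[d - 1] and h[d] >= h[d + 1]]


# Three frequency bins, five azimuth positions. Position 1 is supported at every
# frequency; position 3 is supported at two of them.
pattern = [
    [0.0, 2.0, 0.0, 0.0, 1.0],
    [0.0, 3.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 2.0, 0.0],
]
h = integrate_over_frequency(pattern)
peaks = find_peaks(h)
```

Because every frequency contributes with equal weight, the weaker source at position 3 still yields a distinct peak next to the dominant one at position 1.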

Notably, the interaural time difference is ambiguous for high frequency
sounds where the acoustic wavelengths are less than the separation distance D
between sensors 22, 24. This ambiguity arises from the occurrence of phase
multiples above this inter-sensor distance related frequency, such that a particular
phase difference $\Delta\phi$ cannot be distinguished from $\Delta\phi + 2\pi$. As a result, there is not a
one-to-one relationship of position versus frequency above a certain frequency.
Thus, in addition to the primary vertical trace corresponding to $\theta_d = \theta_i$, there are also
secondary relationships that characterize the variation of position with frequency for
each ambiguous phase multiple. These secondary relationships are taken into
account for the second estimation technique for integrating over frequency. Equation
(31) provides a means to determine a predictive coincidence pattern for a given
azimuth that accounts for these secondary relationships as follows:
$$\sin\theta_i - \sin\theta_d = \frac{\gamma_{m,d}}{\mathrm{ITD}_{max}\, f_m}, \qquad (31)$$

where the parameter $\gamma_{m,d}$ is an integer, and each value of $\gamma_{m,d}$ defines a contour in the
pattern $P_N(\theta_i, m)$. The primary relationship is associated with $\gamma_{m,d} = 0$. For a specific
$\theta_d$, the range of valid $\gamma_{m,d}$ is given by equation (32) as follows:

$$-\mathrm{ITD}_{max}\, f_m (1 + \sin\theta_d) \leq \gamma_{m,d} \leq \mathrm{ITD}_{max}\, f_m (1 - \sin\theta_d). \qquad (32)$$
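
By way of nonlimiting example, the primary and secondary contours for a given azimuth and frequency may be enumerated in Python as sketched below. The sketch assumes the relation sin θ_i = sin θ_d + γ/(ITD_max·f_m), which is the relation implied by the bounds of equation (32), and it assumes ITD_max = D/c with a sensor spacing D of 144 mm and a sound speed c of 343 m/s; the function name and these numeric values are illustrative assumptions.

```python
import math
from typing import Dict


def stencil_azimuths(theta_d_deg: float, f_m: float, itd_max: float) -> Dict[int, float]:
    """Map each valid integer gamma to the azimuth (degrees) of its contour."""
    s_d = math.sin(math.radians(theta_d_deg))
    # Integer range allowed by equation (32).
    lowest = math.ceil(-itd_max * f_m * (1.0 + s_d))
    highest = math.floor(itd_max * f_m * (1.0 - s_d))
    contours = {}
    for gamma in range(lowest, highest + 1):
        s_i = s_d + gamma / (itd_max * f_m)
        s_i = max(-1.0, min(1.0, s_i))      # guard against rounding just past +/-1
        contours[gamma] = math.degrees(math.asin(s_i))
    return contours


itd_max = 0.144 / 343.0                     # assumed D / c, in seconds
low = stencil_azimuths(0.0, 1000.0, itd_max)    # below the ambiguity onset
high = stencil_azimuths(0.0, 3000.0, itd_max)   # above the ambiguity onset
```

Under these assumptions a source at 0° shows only the primary contour (γ = 0) at 1 kHz, whereas at 3 kHz two secondary contours appear at roughly ±52.6°, consistent with ambiguity setting in once the wavelength falls below the sensor spacing.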

The graph 600 of Fig. 18 illustrates a number of representative coincidence
patterns 612, 614, 616, 618 determined in accordance with equations (31) and (32);
where the vertical axis represents frequency in Hz and the horizontal axis represents
azimuth position in degrees. Pattern 612 corresponds to the azimuth position of 0°.
Pattern 612 has a primary relationship corresponding to the generally straight, solid
vertical line 612a and a number of secondary relationships corresponding to curved
solid line segments 612b. Similarly, patterns 614, 616, 618 correspond to azimuth
positions of -75°, 20°, and 75° and have primary relationships shown as straight
vertical lines 614a, 616a, 618a and secondary relationships shown as curved line
segments 614b, 616b, 618b, in correspondingly different broken line formats. In
general, the vertical lines are designated primary contours and the curved line
segments are designated secondary contours. Coincidence patterns for other azimuth
positions may be determined with equations (31) and (32) as would occur to those
skilled in the art.

Notably, the existence of these ambiguities in $P_N(\theta_i, m)$ may generate
artifactual peaks in $H_N(\theta_d)$ after integration along $\theta_d = \theta_i$. Superposition of the curved
traces corresponding to several sources may induce a noisier $H_N(\theta_d)$ term. When far
away from the peaks of any real sources, the artifact peaks may erroneously indicate
the detection of nonexistent sources; however, when close to the peaks corresponding
to true sources, they may affect both the detection and localization of peaks of real
sources in $H_N(\theta_d)$. When it is desired to reduce the adverse impact of phase
ambiguity, localization may take into account the secondary relationships in addition
to the primary relationship for each given azimuth position. Thus, a coincidence
pattern for each azimuthal direction $\theta_d$ (d = 1, ..., I) of interest may be determined and
plotted that may be utilized as a "stencil" window having a shape defined by $P_N(\theta_i, m)$
(i = 1, ..., I; m = 1, ..., M). In other words, each stencil is a predictive pattern of the
coincidence points attributable to an acoustic source at the azimuth position of the
primary contour, including phantom loci corresponding to other azimuth positions as
a function of frequency. The stencil pattern may be used to filter the data at different
values of m.

By employing equation (32), the integration approximation of equation
(30) is modified as reflected in the following equation (33):

$$H_N(\theta_d) = \frac{1}{A(\theta_d)} \sum_{m} \sum_{\gamma_{m,d}} P_N\!\left(\sin^{-1}\!\left(\sin\theta_d + \frac{\gamma_{m,d}}{\mathrm{ITD}_{max}\, f_m}\right), m\right), \quad d = 1, \ldots, I, \qquad (33)$$

where $A(\theta_d)$ denotes the number of points involved in the summation. Notably,
equation (30) is a special case of equation (33) corresponding to $\gamma_{m,d} = 0$. Thus,
equation (33) is used in place of equation (30) when the second technique of
integration over frequency is desired.
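
The stencil-based integration may be sketched in Python as follows, as a nonlimiting illustration. The sketch assumes the contour relation sin θ_i = sin θ_d + γ/(ITD_max·f_m) implied by the bounds of equation (32), assumes ITD_max = D/c with D = 144 mm and c = 343 m/s, and snaps each contour azimuth to the nearest grid position; the nearest-neighbor lookup, the coarse five-point azimuth grid, and all names are assumptions of this sketch.

```python
import math
from typing import Sequence


def stencil_integrate(pattern: Sequence[Sequence[float]],
                      azimuths_deg: Sequence[float],
                      freqs_hz: Sequence[float],
                      itd_max: float, d: int) -> float:
    """Average P_N over the primary and secondary contours for azimuth index d.

    pattern[m][i] is a hypothetical container for P_N(theta_i, m).
    """
    s_d = math.sin(math.radians(azimuths_deg[d]))
    total, count = 0.0, 0
    for m, f_m in enumerate(freqs_hz):
        lowest = math.ceil(-itd_max * f_m * (1.0 + s_d))     # equation (32)
        highest = math.floor(itd_max * f_m * (1.0 - s_d))
        for gamma in range(lowest, highest + 1):
            s_i = max(-1.0, min(1.0, s_d + gamma / (itd_max * f_m)))
            theta_i = math.degrees(math.asin(s_i))
            # Nearest grid position (an assumption; no interpolation rule is given).
            i = min(range(len(azimuths_deg)),
                    key=lambda k: abs(azimuths_deg[k] - theta_i))
            total += pattern[m][i]
            count += 1                                       # A(theta_d)
    return total / count if count else 0.0


itd_max = 0.144 / 343.0                      # assumed D / c, in seconds
grid = [-90.0, -45.0, 0.0, 45.0, 90.0]       # coarse azimuth grid in degrees
freqs = [1000.0, 3000.0]
# A source at 0 degrees: only the primary locus at 1 kHz, plus two
# ambiguous loci near +/-45 degrees at 3 kHz.
pattern = [
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0],
]
h_center = stencil_integrate(pattern, grid, freqs, itd_max, d=2)
```

In this toy case all four stencil points for the 0° direction coincide with observed loci, so the normalized sum is 1.0; the loci near ±45° are credited to the true source rather than reported as separate sources.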

As shown in equation (2), both variables $\theta_i$ and $\tau_i$ are equivalent and represent
the position in the dual delay-line. The difference between these variables is that $\theta_i$
indicates location along the dual delay-line by using its corresponding spatial azimuth,
whereas $\tau_i$ denotes location by using the corresponding time-delay unit of value $\tau_i$.
Therefore, the stencil pattern becomes much simpler if the stencil filter function is
expressed with $\tau_i$ as defined in the following equation (34):



x - T (34) 



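where $\tau_d$ relates to $\theta_d$ through equation (4). For a specific $\tau_d$, the range of valid $\gamma_{m,d}$ is
given by equation (35) as follows:

$$-(\mathrm{ITD}_{max}/2 + \tau_d)\, f_m \leq \gamma_{m,d} \leq (\mathrm{ITD}_{max}/2 - \tau_d)\, f_m, \quad \gamma_{m,d} \text{ is an integer.} \qquad (35)$$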

Changing the value of $\tau_d$ only shifts the coincidence pattern (or stencil pattern) along the
$\tau_i$-axis without changing its shape. The approach characterized by equations (34) and
(35) may be utilized as an alternative to separate patterns for each azimuth position of
interest; however, because the scaling of the delay units $\tau_i$ is uniform along the dual
delay-line, azimuthal partitioning by the dual delay-line is not uniform, with the
regions close to the median plane having higher azimuthal resolution. On the other
hand, in order to obtain an equivalent resolution in azimuth, using a uniform $\tau_i$ would
require a much larger I of delay units than using a uniform $\theta_i$.
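
The trade-off between uniform delay spacing and uniform azimuth spacing may be made concrete with the following nonlimiting Python sketch. It assumes that the delay is proportional to the sine of the azimuth over a range of -90° to +90°; that proportionality and the function names are assumptions of this sketch, since equation (2) itself is not reproduced in this passage.

```python
import math


def units_uniform_azimuth(resolution_deg: float) -> int:
    """Delay units needed when positions are spaced uniformly in azimuth."""
    return round(180.0 / resolution_deg) + 1


def units_uniform_delay(resolution_deg: float) -> int:
    """Delay units needed when positions are spaced uniformly in delay.

    Assumes delay proportional to sin(azimuth). The finest delay step is set
    by the last azimuth step next to +/-90 degrees, where the sine is flattest.
    """
    smallest_step = 1.0 - math.sin(math.radians(90.0 - resolution_deg))
    return math.ceil(2.0 / smallest_step) + 1


azimuth_units = units_uniform_azimuth(0.5)
delay_units = units_uniform_delay(0.5)
```

Under this assumption, 0.5° resolution takes 361 azimuth-uniform positions, whereas a delay-uniform line that still resolves 0.5° at the lateral extremes needs on the order of tens of thousands of positions.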

The signal flow diagram of Fig. 16 further illustrates selected details
concerning localization operator 460. With equalization factors $\alpha_i(m)$ set to unity, the
delayed signals of pairs of delay stages 344 are sent to coincidence detection operators
462 for each frequency indexed to m to determine the coincidence points. Detection
operators 462 determine the minima in accordance with equation (22) or (26). Each
coincidence detection operator 462 sends the results $i_n(m)$ to a corresponding pattern
generator 464 for the given m. Generators 464 build a 2-D coincidence plot for each
frequency indexed to m and pass the results to a corresponding summation operator
466 to perform the operation expressed in equation (28) for that given frequency.
Summation operators 466 approximate integration over time. In Fig. 16, only
operators 462, 464, and 466 corresponding to m=1 and m=M are illustrated to
preserve clarity, with those corresponding to m=2 through m=M-1 being represented
by ellipses.

Summation operators 466 pass results to summation operator 468 to
approximate integration over frequency. Operator 468 may be configured in
accordance with equation (30) if artifacts resulting from the secondary relationships
at high frequencies are not present or may be ignored. Alternatively, stencil filtering
with predictive coincidence patterns that include the secondary relationships may be
performed by applying equation (33) with summation operator 468.

Referring back to Fig. 15, operator 468 outputs $H_N(\theta_d)$ to output device 490 to
map corresponding acoustic source positional information. Device 490 preferably
includes a display or printer capable of providing a map representative of the spatial
arrangement of the acoustic sources relative to the predetermined azimuth positions.
In addition, the acoustic sources may be localized and tracked dynamically as they
move in space. Movement trajectories may be estimated from the sets of locations
$\delta(i - i_n(m))$ computed at each sample window. For other embodiments incorporating
system 410 into a small portable unit, such as a hearing aid, output device 490 is
preferably not included. In still other embodiments, output device 90 may not be
included.

The localization techniques of localization operator 460 are particularly suited 
to localize more than two acoustic sources of comparable sound pressure levels and 
frequency ranges, and need not specify an on-axis desired source. As such, the 
localization techniques of system 410 provide independent capabilities to localize and 
map more than two acoustic sources relative to a number of positions as defined with 
respect to sensors 22, 24. However, in other embodiments, the localization capability 
of localization operator 460 may also be utilized in conjunction with a designated 
reference source to perform extraction and noise suppression. Indeed, extraction 
operator 480 of the illustrated embodiment incorporates such features as more fully
described hereinafter.

Existing systems based on a two sensor detection arrangement generally only 
attempt to suppress noise attributed to the most dominant interfering source through 
beamforming. Unfortunately, this approach is of limited value when there are a 
number of comparable interfering sources at proximal locations. 

It has been discovered that by suppressing one or more different frequency 
components in each of a plurality of interfering sources after localization, it is 
possible to reduce the interference from the noise sources in complex acoustic 
environments, such as in the case of multi-talkers, in spite of the temporal and 
frequency overlaps between talkers. Although a given frequency component or set of 
components may only be suppressed in one of the interfering sources for a given time 
frame, the dynamic allocation of suppression of each of the frequencies among the 
localized interfering acoustic sources generally results in better intelligibility of the 
desired signal than is possible by simply nulling only the most offensive source at all 
frequencies. 

Extraction operator 480 provides one implementation of this approach by
utilizing localization information from localization operator 460 to identify Q
interfering noise sources corresponding to positions other than i = s. The positions of
the Q noise sources are represented by i = noise1, noise2, ..., noiseQ. Notably,
operator 480 receives the outputs of signal operator 350 as described in connection
with system 310, that presents corresponding signals $X_n^{(i=noise1)}(m)$, $X_n^{(i=noise2)}(m)$, ...,
$X_n^{(i=noiseQ)}(m)$ for each frequency m. These signals include a component of the desired
signal at frequency m as well as components from sources other than the one to be
canceled. For the purpose of extraction and suppression, the equalization factors
$\alpha_i(m)$ need not be set to unity once localization has taken place. To determine which
frequency component or set of components to suppress in a particular noise source,
the amplitudes of $X_n^{(i=noise1)}(m)$, $X_n^{(i=noise2)}(m)$, ..., $X_n^{(i=noiseQ)}(m)$ are calculated and
compared. The minimum $X_n^{(inoise)}(m)$ is taken as output $\hat{S}_n(m)$ as defined by the
following equation (36):

$$\hat{S}_n(m) = X_n^{(inoise)}(m), \qquad (36)$$

where $X_n^{(inoise)}(m)$ satisfies the condition expressed by equation (37) as follows:

$$|X_n^{(inoise)}(m)| = \min\left\{ |X_n^{(i=noise1)}(m)|, |X_n^{(i=noise2)}(m)|, \ldots, |X_n^{(i=noiseQ)}(m)|, |\alpha_s(m) X_{Ln}^{(s)}(m)| \right\}; \qquad (37)$$

for each value of m. It should be noted that, in equation (37), the original signal
$\alpha_s(m) X_{Ln}^{(s)}(m)$ is included. The resulting beam pattern may at times amplify other
less intense noise sources. When the amount of noise amplification is larger than the
amount of cancellation of the most intense noise source, further conditions may be
included in operator 480 to prevent changing the input signal for that frequency at that
moment.
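
The per-frequency selection of equations (36) and (37) may be sketched in Python as follows, by way of nonlimiting example. The function name and the list layouts are assumptions for illustration: `noise_outputs[q][m]` stands for the signal-operator output at the position of the q-th localized noise source, and `original[m]` stands for the unprocessed term $\alpha_s(m) X_{Ln}^{(s)}(m)$.

```python
from typing import List, Sequence


def extract_spectrum(noise_outputs: Sequence[Sequence[complex]],
                     original: Sequence[complex]) -> List[complex]:
    """For each bin m, keep the candidate with the smallest magnitude."""
    result = []
    for m in range(len(original)):
        # Candidates: one nulled output per noise source, plus the original signal.
        candidates = [output[m] for output in noise_outputs] + [original[m]]
        result.append(min(candidates, key=abs))    # equation (37)
    return result                                   # equation (36)


# Two localized noise sources, two frequency bins.
noise_outputs = [
    [1 + 0j, 0.1j],       # output with a null placed on noise source 1
    [0.3 + 0j, 2 + 0j],   # output with a null placed on noise source 2
]
original = [0.5 + 0j, 0.05 + 0j]
estimate = extract_spectrum(noise_outputs, original)
```

In bin 0 the null on the second source gives the smallest magnitude and is selected; in bin 1 both nulled outputs are larger than the original, so the original component is passed through unchanged, which reflects why the original signal is included in equation (37).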

Processors 30, 330, 430 include one or more components that embody the
corresponding algorithms, stages, operators, converters, generators, arrays,
procedures, processes, and techniques described in the respective equations and signal
flow diagrams in software, hardware, or both utilizing techniques known to those
skilled in the art. Processors 30, 330, 430 may be of any type as would occur to those
skilled in the art; however, it is preferred that processors 30, 330, 430 each be based
on a solid-state, integrated digital signal processor with dedicated hardware to
perform the necessary operations with a minimum of other components.

Systems 310, 410 may be sized and adapted for application as a hearing aid
of the type described in connection with Fig. 4A. In a further hearing aid
embodiment, sensors 22, 24 are sized and shaped to fit in the pinnae of a
listener, and the processor algorithms are adjusted to account for shadowing caused
by the head and torso. This adjustment may be provided by deriving a
Head-Related-Transfer-Function (HRTF) specific to the listener or from a population average using
techniques known to those skilled in the art. This function is then used to provide
appropriate weightings of the dual delay stage output signals that compensate for
shadowing.

In yet another embodiment, systems 310, 410 are adapted to voice recognition
systems of the type described in connection with Fig. 4B. In still other embodiments,
systems 310, 410 may be utilized in sound source mapping applications, or as would
otherwise occur to those skilled in the art.

It is contemplated that various signal flow operators, converters, functional 
blocks, generators, units, stages, processes, and techniques may be altered, 
rearranged, substituted, deleted, duplicated, combined or added as would occur to 
those skilled in the art without departing from the spirit of the present inventions. 

All publications and patent applications cited in this specification are herein 
incorporated by reference as if each individual publication or patent application were 
specifically and individually indicated to be incorporated by reference, including, but 
not limited to U.S. Patent Application Serial No. 08/666,757 filed on 19 June 1996. 
EXPERIMENTAL SECTION

The following experimental results are provided as nonlimiting examples, and
should not be construed to restrict the scope of the present invention.

EXAMPLE ONE

A Sun Sparc-20 workstation was programmed to emulate the signal extraction
process of the present invention. One loudspeaker (L1) was used to emit a speech
signal and another loudspeaker (L2) was used to emit babble noise in a semi-anechoic
room. Two microphones of a conventional type were positioned in the room and
operatively coupled to the workstation. The microphones had an inter-microphone
distance of about 15 centimeters and were positioned about 3 feet from L1. L1 was
aligned with the midpoint between the microphones to define a zero degree azimuth.
L2 was placed at different azimuths relative to L1 approximately equidistant to the
midpoint between L1 and L2.

Referring to FIG. 5, a clean speech of a sentence about two seconds long is
depicted, emanating from L1 without interference from L2. FIG. 6 depicts a
composite signal from L1 and L2. The composite signal includes babble noise from
L2 combined with the speech signal depicted in FIG. 5. The babble noise and speech
signal are of generally equal intensity (0 dB) with L2 placed at a 60 degree azimuth
relative to L1. FIG. 7 depicts the signal recovered from the composite signal of FIG.
6. This signal is nearly the same as the signal of FIG. 5.

FIG. 8 depicts another composite signal where the babble noise is 30 dB more
intense than the desired signal of FIG. 5. Furthermore, L2 is placed at only a 2 degree
azimuth relative to L1. FIG. 9 depicts the signal recovered from the composite signal
of FIG. 8, providing a clearly intelligible representation of the signal of FIG. 5
despite the greater intensity of the babble noise from L2 and the nearby location.
EXAMPLE TWO

Experiments corresponding to system 410 were conducted with two groups having
four talkers (2 male, 2 female) in each group. Five different tests were conducted for each
group with different spatial configurations of the sources in each test. The four talkers were
arranged in correspondence with sources 412, 414, 416, 418 of Fig. 14 with different values
for angles 412a, 414a, 416a, and 418a in each test. The illustration in Fig. 14 most closely
corresponds to the first test with angle 418a being -75 degrees, angle 412a being 0 degrees,
angle 414a being +20 degrees, and angle 416a being +75 degrees. The coincidence patterns
612, 614, 616, and 618 of Fig. 18 also correspond to the azimuth positions of -75 degrees, 0
degrees, +20 degrees, and +75 degrees.

The experimental set-up for the tests utilized two microphones for sensors 22, 24 with 
an inter-microphone distance of about 144mm. No diffraction or shadowing effect existed 
between the two microphones, and the inter-microphone intensity difference was set to zero 
for the tests. The signals were low-pass filtered at 6 kHz and sampled at a 12.8-kHz rate 
with 16-bit quantization. A Wintel-based computer was programmed to receive the 
quantized signals for processing in accordance with the present invention and output the test 
results described hereinafter. In the short-term spectral analysis, a 20-ms segment of signal 
was weighted by a Hamming window and then padded with zeros to 2048 points for DFT, 
and thus the frequency resolution was about 6Hz. The values of the time delay units T f (/=1, 
...,/) were determined such that the azimuth resolution of the dual delay-line was 0.5° 
uniformly, namely 7=361. The dual delay-line used in the tests was azimuth-uniform. The 
coincidence detection method was based on minimum magnitude differences. 
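
The analysis figures quoted above can be checked with simple arithmetic. The following Python sketch is illustrative only: the function name is hypothetical, and the assumption that the azimuth positions span -90 to +90 degrees is not stated in the text.

```python
def analysis_parameters(fs_hz=12_800.0, segment_s=0.020, nfft=2048, az_step_deg=0.5):
    """Derive the short-term analysis figures quoted for the tests.

    Assumes (hypothetically) that the azimuth grid spans -90 to +90 degrees.
    """
    segment_samples = round(fs_hz * segment_s)    # samples in one 20-ms segment
    freq_resolution_hz = fs_hz / nfft             # DFT bin spacing after zero padding
    n_azimuths = round(180.0 / az_step_deg) + 1   # grid points, both end points included
    return segment_samples, freq_resolution_hz, n_azimuths
```

With the stated values this yields 256 samples per segment, a bin spacing of 6.25 Hz (the "about 6 Hz" noted above), and 361 azimuth positions.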

Each of the five tests consisted of four subtests in which a different talker was taken 
as the desired source. To test the system performance under the most difficult experimental 
constraint, the speech materials (four equally-intense spondaic words) were intentionally 
aligned temporally. The speech material was presented in free-field. The localization of the 
talkers was done using both the equation (30) and equation (33) techniques. 

The system performance was evaluated using an objective intelligibility-weighted 
measure, as proposed in Peterson, P.M., "Adaptive array processing for multiple 
microphone hearing aids," Ph.D. Dissertation, Dept. Elect. Eng. and Comp. Sci., MIT; Res.
Lab. Elect. Tech. Rept. 541, MIT, Cambridge, MA (1989), and described in detail in Liu,
C. and Sideman, S., "Simulation of fixed microphone arrays for directional hearing aids," J.
Acoust. Soc. Am. 100, 848-856 (1996). Specifically, intelligibility-weighted signal 
cancellation, noise cancellation, and net gain were used.



The experimental results are presented in Tables I, II, III, and IV of FIGs. 19-22,
respectively. The five tests described in Table I of FIG. 19 approximate integration over
frequency by utilizing equation (30); and includes two male speakers M1, M2 and two
female speakers F1, F2. The five tests described in Table II of FIG. 20 are the same as
Table I except that integration over frequency was approximated by equation (33). The
five tests described in Table III of FIG. 21 approximate integration over frequency by
utilizing equation (30); and includes two different male speakers M3, M4 and two different
female speakers F3, F4. The five tests described in Table IV of FIG. 22 are the same as
Table III except that integration over frequency was approximated by equation (33).

[...] representing the degree of cancellation in dB of the desired source (ideally 0 dB), and
the next to the last column shows the degree of cancellation of all the noise sources lumped
together, while the last column gives the net intelligibility-weighted improvement (which
considers both noise cancellation and loss in the desired signal).

The results generally show cancellation in the intelligibility-weighted [...]
of about 3-11 dB, while degradation of the desired source was generally less than
[...] used in the tests. Similar results were obtained from six-talker experiments. Generally,
a [...] dB enhancement in the intelligibility-weighted signal-to-noise ratio resulted when
there were six equally loud, temporally aligned speech sounds originating from six different
[...]

[...] foregoing description, the same is to be considered as illustrative and not restrictive in
[...] described and that all changes, modifications, and equivalents that come within the spirit
of the invention defined by the following claims are desired to be protected.
CLAIMS



What is claimed is: 

1. A method, comprising:

providing a first signal from a first acoustic sensor and a second signal from a second 
acoustic sensor spaced apart from the first acoustic sensor, the first signal and the second 
signal each corresponding to two or more acoustic sources, said acoustic sources including a 
plurality of interfering sources and a desired source; 

localizing the interfering sources from the first and second signals to provide a 
corresponding number of interfering source signals each corresponding to a different one of 
the interfering sources and each including a plurality of frequency components, the
components each corresponding to a different frequency; and 

suppressing one or more different frequency components of each of the interfering 
source signals to reduce noise. 

2. The method of claim 1, wherein said suppressing includes extracting a desired signal 
representative of the desired source. 

3. The method of claim 2, wherein said extracting includes determining a minimum 
value as a function of the interfering signals. 

4. The method of claim 1, wherein said localizing includes filtering with a number of 
coincidence patterns each corresponding to one of a number of predetermined spatial 
positions relative to the first and second sensors, the patterns each providing phantom 
position information that varies with frequency relative to the one of the predetermined 
spatial positions. 

5. The method of claim 1, further comprising delaying the first and second signals with a 
different dual delay line for each of a number of frequencies to provide a corresponding 
number of delayed signals to perform said localizing. 

6. The method of claim 5, further comprising processing the delayed signals after said 
localizing to perform said suppressing. 
7. The method of claim 6, further comprising:

transforming the first and second signals from a time domain form to a frequency 
domain form in terms of the frequencies before said delaying; 

extracting a desired signal representative of the desired source, said extracting 
including said suppressing; 

transforming the desired signal from a frequency domain form to a time domain form; 

and 

generating an acoustic output representative of the desired source from the time 
domain form of the desired signal. 

8. The method of claim 5, wherein the interfering signals are each determined from a 
unique pair of the delayed signals as a ratio between a difference in magnitude of the unique 
pair of the delayed signals and a difference determined as a function of an amount of delay 
associated with each member of the unique pair of the delayed signals. 



9. A system, comprising:

a pair of spaced apart acoustic sensors each arranged to detect two or more differently 
located acoustic sources and correspondingly generate a pair of input signals, said acoustic 
sources including a desired source and a plurality of interfering sources; 

a delay operator responsive to said input signals to generate a number of delayed 
signals therefrom; 

a localization operator responsive to said delayed signals to localize said interfering 
sources relative to location of said sensors and provide a plurality of interfering source 
signals each representative of a corresponding one of said interfering sources, said 
interfering source signals each being represented in terms of a plurality of frequency 
components, said components each corresponding to a different frequency; 

an extraction operator responsive to said interfering source signals to suppress at least 
one of said frequency components of each of said interfering source signals and extract a 
desired signal corresponding to said desired source, said at least one of said frequency 
components being different for each of said interfering source signals; and 

an output device responsive to said desired signal to provide an output corresponding 
to said desired source. 
10. The system of claim 9, wherein said localization operator includes a filter to localize
said interfering sources relative to a number of positions, said filter being based on a 
different coincidence pattern of ambiguous positional information that varies with 
frequency for each of said positions. 

11. The system of claim 9, further comprising:

an analog-to-digital converter responsive to said input signals to convert each of said 
input signals from an analog form to a digital form; 

a first transformation stage responsive to said digital form of said input signals to 
transform said input signals from a time domain form to a frequency domain form in terms
of a plurality of discrete frequencies, said delay operator including a dual delay line for each 
of the frequencies; 

a second transformation stage responsive to said desired signal to transform said 
desired signal from a digital frequency domain form to a digital time domain form; and 
a digital-to-analog converter responsive to said digital time domain form to convert
said desired signal to an analog output form for said output device.

12. The system of claim 9, wherein said delay operator, said localization operator, and 
said extraction operator are provided by a solid state signal processing device. 

13. The system of claim 9, wherein said desired source signal is determined as a function 
of said interfering signals. 

14. The system of claim 9, wherein said interfering source signals are each determined 
from a unique pair of said delayed signals.

15. The system of claim 14, wherein said interfering signals each correspond to a ratio 
between a difference in magnitude of said unique pair of said delayed signals and a 
difference determined as a function of an amount of delay associated with each member of
said unique pair of said delayed signals.

16. The system of claim 9, wherein said output device is configured to provide an 
acoustic output representative of said desired source. 
17. A method, comprising:

positioning a first acoustic sensor and a second acoustic sensor to detect a plurality of 
differently located acoustic sources; 

generating a first signal corresponding to said sources with said first sensor and a 
second signal corresponding to said sources with said second sensor; 

providing a number of delayed signal pairs from the first and second signals, the
delayed signal pairs each corresponding to one of a number of positions relative to the first 
and second sensors; and 

localizing the sources as a function of the delayed signal pairs and a number of 
coincidence patterns, the patterns each corresponding to one of the positions and 
establishing an expected variation of acoustic source position information with frequency 
attributable to a source at the one of the positions. 

18. The method of claim 17, wherein the coincidence patterns each correspond to a
number of relationships characterizing a variation of phantom acoustic source position with 
frequency, the relationships each corresponding to a different ambiguous phase multiple. 

19. The method of claim 18, further comprising determining the relationships for each of
the coincidence patterns as a function of distance separating the first and second sensors. 

20. The method of claim 18, wherein the relationships each correspond to a secondary
contour that curves in relation to a primary contour, the primary contour representing 
frequency invariant acoustic source position information determined from the delayed 
signal pair corresponding to the one of the positions. 

21. The method of claim 17, wherein said localizing includes filtering with the 
coincidence patterns to enhance true position information with phantom position 
information. 

22. The method of claim 21, wherein said localizing includes integrating over time and 
integrating over frequency. 



23. The method of claim 17, wherein the first sensor and second sensor are part of a 
hearing aid device and further comprising adjusting the delayed signal pairs with a
head-related-transfer function.

24. The method of claim 17, further comprising: 
extracting a desired signal after said localizing; and 

suppressing a different set of frequency components for each of a selected number of 
the sources to reduce noise. 

25. The method of claim 17, wherein the positions each correspond to an azimuth 
established relative to the first and second sensors and further comprising generating a map 
showing relative location of each of the sources. 

26. A system, comprising:

a pair of spaced apart acoustic sensors each configured to generate a corresponding 
one of a pair of input signals, the signals being representative of a number of differently
located acoustic sources; 

a delay operator responsive to said input signals to generate a number of delayed 
signals each corresponding to one of a number of positions relative to said sensors;

a localization operator responsive to said delayed signals to determine a number of 
sound source localization signals from said delayed signals and a number of coincidence 
patterns, said patterns each corresponding to one of said positions and relating frequency 
varying sound source position information caused by ambiguous phase multiples to said one 
of said positions to improve sound source localization; and

an output device responsive to said localization signals to provide an output 
corresponding to at least one of said sources. 

27. The system of claim 26, further comprising: 
an analog-to-digital converter responsive to said input signals to convert each of said
input signals from an analog form to a digital form; and

a first transformation stage responsive to said digital form of said input signals to 
transform said input signals from a time domain form to a frequency domain form in terms 
of a plurality of discrete frequencies, said delay operator including a dual delay line for each
of the frequencies. 



28. The system of claim 27, further comprising: 

an extraction operator responsive to said localization signals to extract a desired
signal;

a second transformation stage responsive to said desired signal to transform said 
desired signal from a digital frequency domain form to a digital time domain form; and 
a digital to analog converter responsive to said digital time domain form to convert 
said desired signal to an analog output form for said output device.

29. The system of claim 26, wherein said output device is configured to provide a map of 
acoustic source locations. 

30. The system of claim 26, wherein said delay operator and said localization operator are
defined by an integrated solid state signal processor. 

31. The system of claim 26, wherein said localization operator responds to said delay 
signals to determine a closest one of said positions for one of said sources as a function of at 
least one of said delayed signals corresponding to said closest one of said positions and at
least two other of said delayed signals corresponding to other of said positions, said at least 
two other of said delayed signals being determined with a corresponding one of said 
coincidence patterns. 

32. A system, comprising:

a pair of spaced apart acoustic sensors each generating a corresponding one of a pair 
of input signals, the signals each being representative of a number of differently located
sound sources; 

a signal processor responsive to said sensors, said processor including: (a) a means 
for providing a number of delayed signals from said input signals, the delayed signals each
corresponding to one of a number of positions relative to said first and second sensors; (b) a 
means for localizing each of said sound sources to one of said positions as a function of said 
delayed signals and a corresponding one of a number of patterns of frequency invariant data 
corresponding to one of said positions and frequency dependent data corresponding to at 
least two other of said positions; (c) a means for suppressing a different frequency
component of each of a selected number of said sources causing interference and for 
extracting a desired signal representative of one of said sources; and 

an output device responsive to said desired signal to provide an output corresponding 
to said one of said sources. 

33. The system of claim 32, wherein said processor includes a means for adjusting said 
delayed signals with a head-related-transfer function.
ABSTRACT OF THE DISCLOSURE



A desired acoustic signal is extracted from a noisy environment by generating 
a signal representative of the desired signal with a processor. The processor receives 
aural signals from two sensors each at a different location. The two inputs to the 
processor are converted from analog to digital format and then submitted to a discrete 
Fourier transform process to generate discrete spectral signal representations. The 
spectral signals are delayed by a number of time intervals in a dual delay line to 
provide a number of intermediate signals, each corresponding to a different spatial 
location relative to the two sensors. Locations of the noise source and the desired 
source are determined and the spectral content of the desired signal is determined 
from the intermediate signal corresponding to the noise source locations. Inverse 
transformation of the selected intermediate signal followed by digital to analog 
conversion provides an output signal representative of the desired signal. Techniques 
to localize multiple acoustic sources are also disclosed. Further, a technique to 
enhance noise reduction from multiple sources based on two-sensor reception is 
described. 
BINAURAL SIGNAL PROCESSING SYSTEM AND METHOD



BACKGROUND OF THE INVENTION

The present invention is directed to the processing of acoustic signals, and more 
particularly, but not exclusively, relates to the separation of acoustic signals emanating from
different sources by detecting a mixture of the acoustic signals at multiple locations.

The difficulty of extracting a desired signal in the presence of interfering signals is a 
long-standing problem confronted by acoustic engineers. This problem impacts the design 
and construction of many kinds of devices such as systems for voice recognition and 
intelligence gathering. Especially troublesome is the separation of desired sound from
unwanted sound with hearing aid devices. Generally, hearing aid devices do not permit
selective amplification of a desired sound when contaminated by noise from a nearby 
source - particularly when the noise is more intense. This problem is even more severe 
when the desired sound is a speech signal and the nearby noise is also the result of speech 
(e.g. babble). As used herein, "noise" refers not only to random or nondeterministic
signals, but also to undesired signals and signals interfering with the perception of a desired
signal. 

One attempted solution to this problem has been the application of a single, highly 
directional microphone to enhance directionality of the hearing aid receiver. This approach 
has only a very limited capability. As a result, spectral subtraction, comb filtering, and 
speech-production modeling have been explored to enhance single microphone performance.
Nonetheless, these approaches still generally fail to improve intelligibility of a desired speech 
signal, particularly when the signal and noise source are in close proximity. 

Another approach has been to arrange a number of microphones in a selected spatial 
relationship to form a type of directional detection beam. Unfortunately, when limited to a 
size practical for hearing aids, beam forming arrays also have limited capacity to separate
signals which are close together - especially if the noise is more intense than a desired 
speech signal. In addition, in the case of one noise source in a less reverberant environment, 
the noise cancellation provided by the beam-former varies with the location of the noise 
source in relation to the microphone array. R. W. Stadler and W. M. Rabinowitz, On the
Potential of Fixed Arrays for Hearing Aids, 94 Journal of the Acoustical Society of America 1332
(September 1993), and W. Soede et al., Development of a Directional Hearing Instrument
Based on Array Technology, 94 Journal of the Acoustical Society of America 785 (August 1993)
are cited as additional background concerning the beam forming approach. 

Still another approach has been the application of two microphones displaced from each
other to provide two signals to emulate certain aspects of the binaural hearing system
common to humans and many types of animals. Although certain aspects of biologic
binaural hearing are still not fully understood, it is believed that the ability to localize sound
sources is based on evaluation of binaural time delays and sound levels across different
frequency bands associated with each of the two sound signals. The localization of sound
sources with systems based on these interaural time and intensity differences is discussed in
W. Lindemann, Extension of a Binaural Cross-Correlation Model by Contralateral Inhibition
I. Simulation of Lateralization for Stationary Signals, 80 Journal of the Acoustical Society
of America 1608 (December 1986). Nonetheless, the separation of a desired signal from
noise or interfering sound still presents a significant problem once the sound sources are
localized. 

For example, the system set forth in Markus Bodden, Modeling Human Sound-Source 
Localization and the Cocktail-Party-Effect, 1 Acta Acustica 43 (February/April 1993)
employs a Wiener filter including a windowing process in an attempt to derive a desired 
signal from binaural input signals once the location of the desired signal has been established.
Unfortunately, this approach results in significant deterioration of desired speech fidelity. 
Also, the system has only been demonstrated to suppress noise of equal intensity to the 
SUMMARY OF THE INVENTION



One feature of the present invention is utilizing two sensors to provide corresponding 
binaural signals from which the relative separation of a first acoustic source from a second
acoustic source may be established as a function of time, and the spectral content of a desired 
acoustic signal from the first source may be representatively extracted. One aspect of this 
feature is that the desired acoustic signal may be successfully extracted even if a nearby noise 
source is of greater relative intensity. 
Another feature of the present invention is detecting an acoustic excitation at a first
location to provide a corresponding first signal and at a second location to provide a 
corresponding second signal. This excitation includes a desired acoustic signal from a first 
source and an interfering acoustic signal from a second source spaced apart from the first 
source. The second source is localized relative to the first source as a function of the first 
and second signals. A characteristic signal is generated which is representative of the desired
acoustic signal during the localization. 

Still another feature is delaying the first and second signals by a number of time 
intervals to correspondingly establish a number of delayed first signals and a number of 
delayed second signals. A time increment corresponding to the separation of the first and 
second sources is determined by comparing the delayed first signals to the delayed second
signals. An output signal representative of the desired signal is generated as a function of the 
time increment. Furthermore, a signal pair indicative of the location of the second source 
may be selected that has a first member selected from the delayed first signals and a second 
member from the delayed second signals. The output signal may be generated as a function 
of this signal pair.
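
By way of illustration only, the comparison of delayed first signals to delayed second signals may be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: it assumes a single far-field source, a speed of sound of 343 m/s, frequency-domain delays applied as phase shifts, and a hypothetical delay rule in which the two delays for a candidate azimuth sum to the sensor spacing divided by the speed of sound. The names `tap_delays` and `localize` are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND_M_S = 343.0  # assumed value, not given in the text


def tap_delays(azimuths_deg, spacing_m):
    """Left/right delays that align a far-field source at each candidate azimuth."""
    total_s = spacing_m / SPEED_OF_SOUND_M_S
    left = 0.5 * total_s * (1.0 + np.sin(np.radians(azimuths_deg)))
    return left, total_s - left


def localize(xl, xr, freqs_hz, azimuths_deg, spacing_m):
    """Return the candidate azimuth whose delayed signal pair differs least."""
    azimuths_deg = np.asarray(azimuths_deg, dtype=float)
    left, right = tap_delays(azimuths_deg, spacing_m)
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float)
    # Rows index candidate positions, columns index frequencies.
    delayed_left = xl * np.exp(-1j * np.outer(left, omega))
    delayed_right = xr * np.exp(-1j * np.outer(right, omega))
    # Minimum magnitude difference, summed over frequency, marks the coincidence.
    mismatch = np.abs(delayed_left - delayed_right).sum(axis=1)
    return float(azimuths_deg[int(np.argmin(mismatch))])
```

Summing the mismatch over frequency is one simple way to resolve the phase ambiguity that a single frequency would leave; other integration rules could be substituted.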

In yet another feature, a processing system utilizes a first and second sensor at different 
locations to provide a binaural representation of an acoustic signal which includes a desired 
signal emanating from a selected source and an interfering signal emanating from a 
interfering source. A processor generates a discrete first spectral signal and a discrete
second spectral signal from the sensor signals. The processor delays the first and second 
spectral signals by a number of time intervals to generate a number of delayed first signals 
and a number of delayed second signals and provide a time increment signal. The time 
increment signal corresponds to separation of the selected source from the noise source. The 
processor generates an output signal as a function of the time increment signal, and an output 
device responds to the output signal to provide a sensory output representative of the desired 
signal. 

Among the other features of the present invention is a system to position a first and 
second sensor relative to a first signal source with the first and second sensor being spaced 
apart from each other and a second signal source being spaced apart from the first signal 
source. A first signal is provided from the first sensor and a second signal is provided from 
the second sensor. The first and second signals each represent a composite acoustic signal 
including a desired signal from the first signal source and an unwanted signal from the 
second signal source. A number of spectral signals are established from the first and second 
signals as a function of a number of frequencies. Each of the spectral signals, such as those 
corresponding to outputs of a delay line, represent a different position relative to the first 
signal source. A member of the spectral signals representative of position of the second 
signal source is determined, and an output signal is generated from the member which is 
representative of the first signal. This feature facilitates extraction of a desired signal from a 
spectral signal determined as part of the localization of the interfering source. As a result, 
localization calculations constitute the bulk of the signal processing because, once localization 
of the interfering source is performed, the desired signal is estimated directly from one of the 
intermediate localization operands. This approach avoids the extensive post-localization 
computations required by many binaural systems. 
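
The idea of estimating the desired signal from an intermediate localization operand may be illustrated with the following Python sketch. It is a simplified model under stated assumptions, not the disclosed equations: the desired source is taken to be on-axis (arriving at both sensors simultaneously), there is one far-field interfering source, the speed of sound is taken as 343 m/s, and the two delays at the tap aligned with the interfering source sum to the sensor spacing divided by the speed of sound. The name `extract_on_axis` and the threshold `eps` are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND_M_S = 343.0  # assumed value, not given in the text


def extract_on_axis(xl, xr, freqs_hz, noise_azimuth_deg, spacing_m, eps=1e-3):
    """Estimate the on-axis source spectrum from the tap aligned with the noise.

    Returns (estimate, usable); bins where the denominator is nearly zero are
    left at zero and flagged False in `usable`.
    """
    xl = np.asarray(xl, dtype=complex)
    xr = np.asarray(xr, dtype=complex)
    total_s = spacing_m / SPEED_OF_SOUND_M_S
    # Delays that bring a source at the noise azimuth into coincidence.
    left_delay = 0.5 * total_s * (1.0 + np.sin(np.radians(noise_azimuth_deg)))
    right_delay = total_s - left_delay
    omega = 2.0 * np.pi * np.asarray(freqs_hz, dtype=float)
    left_phase = np.exp(-1j * omega * left_delay)
    right_phase = np.exp(-1j * omega * right_delay)
    # The noise contributions are identical at this tap, so they cancel here.
    numerator = xl * left_phase - xr * right_phase
    # What remains is the desired spectrum scaled by a known delay-dependent term.
    denominator = left_phase - right_phase
    usable = np.abs(denominator) > eps
    estimate = np.zeros_like(xl)
    estimate[usable] = numerator[usable] / denominator[usable]
    return estimate, usable
```

In this model no separate post-localization filter is needed: the difference of the delayed pair found during localization is divided by a term that depends only on the delays.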

Accordingly, it is one object of the present invention to provide for the extraction of a 
desired acoustic signal from a noisy environment. 



Another object is to provide a device for the separation of acoustic signals by detecting 
a combination of these signals at two locations. This device may be used to aid impaired 
hearing. 

Further objects, features, and advantages of the present invention shall become 
apparent from the detailed drawings and descriptions provided herein. 



BRIEF DESCRIPTION OF THE DRAWINGS 



FIG. 1 is a diagrammatic view of a first embodiment of the present invention. 
FIG. 2 is a signal flow diagram of an extraction process performed by the embodiment
of FIG. 1.

FIG. 3 is a schematic representation of the dual delay line of FIG. 2.
FIGS. 4A and 4B depict other embodiments of the present invention corresponding to 
hearing aid and computer voice recognition applications, respectively. 
FIG. 5 is a graph of a speech signal in the form of a sentence about 2 seconds long.

FIG. 6 is a graph of a composite signal including babble noise and the speech signal of 
FIG. 5 at a 0 dB signal-to-noise ratio with the babble noise source at about a 60 degree azimuth
relative to the speech signal source. 

FIG. 7 is a graph of a signal representative of the speech signal of FIG. 5 after 
extraction from the composite signal of FIG. 6.

FIG. 8 is a graph of a composite signal including babble noise and the speech signal of 
FIG. 5 at a -30 dB signal-to-noise ratio with the babble noise source at a 2 degree azimuth 
relative to the speech signal source. 

FIG. 9 is a graphic depiction of a signal representative of the sample speech signal of 
FIG. 5 after extraction from the composite signal of FIG. 8.
DESCRIPTION OF THE PREFERRED EMBODIMENT



For the purposes of promoting an understanding of the principles of the invention, 
reference will now be made to the embodiment illustrated in the drawings and specific
language will be used to describe the same. It will nevertheless be understood that no 
limitation of the scope of the invention is thereby intended. Any alterations and further 
modifications in the described device, and any further applications of the principles of the 
invention as described herein are contemplated as would normally occur to one skilled in the 
art to which the invention relates.

Fig. 1 illustrates an acoustic signal processing system 10 of the present 
invention. System 10 is configured to extract a desired acoustic signal from source 12 
despite interference or noise emanating from nearby source 14. System 10 includes a 
pair of acoustic sensors 22, 24 configured to detect acoustic excitation that includes 
signals from sources 12, 14. Sensors 22, 24 are operatively coupled to processor 30 to
process signals received therefrom. Also, processor 30 is operatively coupled to output
device 90 to provide a signal representative of a desired signal from source 12 with 
reduced interference from source 14 as compared to composite acoustic signals 
presented to sensors 22, 24 from sources 12, 14. 
Sensors 22, 24 are spaced apart from one another by distance D along lateral
axis T. Midpoint M represents the half way point along distance D from sensor 22 to
sensor 24. Reference axis R1 is aligned with source 12 and intersects axis T
perpendicularly through midpoint M. Axis N is aligned with source 14 and also
intersects midpoint M. Axis N is positioned to form angle A with reference axis R1.
Fig. 1 depicts an angle A of about 20 degrees. Notably, reference axis R1 may be
selected to define a reference azimuthal position of zero degrees in an azimuthal plane
intersecting sources 12, 14; sensors 22, 24; and containing axes T, N, R1. As a result,
source 12 is "on-axis" and source 14, as aligned with axis N, is "off-axis." Source 14
is illustrated at about a 20 degree azimuth relative to source 12.
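
The geometry described above implies a difference in arrival time at the two sensors for an off-axis source. The following Python sketch is offered only as an illustration; it assumes a far-field source and a speed of sound of 343 m/s, neither of which is stated in the text, and the function name is hypothetical.

```python
import math


def interaural_delay_s(spacing_m, azimuth_deg, speed_of_sound_m_s=343.0):
    """Far-field arrival-time difference between two sensors spaced spacing_m apart.

    azimuth_deg is measured from the reference axis through the midpoint
    (angle A in the description); the far-field model is an assumption.
    """
    return spacing_m * math.sin(math.radians(azimuth_deg)) / speed_of_sound_m_s
```

Under this model an on-axis source (azimuth of zero degrees) produces no delay between the sensors, while an off-axis source produces a delay that grows with the sine of angle A and with distance D.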

Preferably sensors 22, 24 are fixed relative to each other and configured to 
move in tandem to selectively position reference axis R1 relative to a desired acoustic
signal source. It is also preferred that sensors 22, 24 be microphones of a
conventional variety, such as omnidirectional dynamic microphones. In other 
embodiments, a different sensor type may be utilized as would occur to one skilled in 
the art. 

Referring additionally to FIG. 2, a signal flow diagram illustrates various 
processing stages for the embodiment shown in FIG. 1. Sensors 22, 24 provide analog 
signals Lp(t) and Rp(t) corresponding to the left sensor 22, and right sensor 24, 
respectively. Signals Lp(t) and Rp(t) are initially input to processor 30 in separate 
processing channels L and R. For each channel L, R, signals Lp(t) and Rp(t) are 
conditioned and filtered in stages 32a, 32b to reduce aliasing, respectively. After filter 
stages 32a, 32b, the conditioned signals Lp(t), Rp(t) are input to corresponding Analog 
to Digital (A/D) converters 34a, 34b to provide discrete signals Lp(k), Rp(k), where k 
indexes discrete sampling events. In one embodiment, A/D stages 34a, 34b sample 
signals Lp(t) and Rp(t) at a rate of at least twice the frequency of the upper end of the 
audio frequency range to assure a high fidelity representation of the input signals. 

Discrete signals Lp(k) and Rp(k) are transformed from the time domain to the 
frequency domain by a short-term Discrete Fourier Transform (DFT) algorithm in 
stages 36a, 36b to provide complex-valued signals XLp(m) and XRp(m). Signals 
XLp(m) and XRp(m) are evaluated in stages 36a, 36b at discrete frequencies f_m, where
m is an index (m = 1 to m = M) to discrete frequencies, and index p denotes the short-term
spectral analysis time frame. Index p is arranged in reverse chronological order
with the most recent time frame being p = 1, the next most recent time frame being p
= 2, and so forth. Preferably, frequencies f_m encompass the audible frequency range
and the number of samples employed in the short-term analysis is selected to strike an
optimum balance between processing speed limitations and desired resolution of 
resulting output signals. In one embodiment, an audio range of 0.1 to 6 kHz is 
sampled in A/D stages 34a, 34b at a rate of at least 12.5 kHz with 512 samples per 
short-term spectral analysis time frame. In alternative embodiments, the frequency 
domain analysis may be provided by an analog filter bank employed before A/D stages 
34a, 34b. It should be understood that the spectral signals XLp(m) and XRp(m) may 
be represented as arrays each having a lxM dimension corresponding to the different 
frequencies / m . 
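By way of nonlimiting illustration, the framing arithmetic of the embodiment above (12.5 kHz sampling, 512 samples per frame) and a direct short-term DFT may be sketched as follows. The names `dft`, `bin_frequency_hz`, `SAMPLE_RATE_HZ`, and `FRAME_LEN` are hypothetical, and the direct O(n²) transform is chosen only for clarity; a practical processor would be expected to use a fast transform.

```python
import cmath
import math

SAMPLE_RATE_HZ = 12_500   # A/D rate from the embodiment above
FRAME_LEN = 512           # samples per short-term analysis frame


def dft(frame):
    """Direct O(n^2) discrete Fourier transform of one frame of samples."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * m * k / n) for k, x in enumerate(frame))
        for m in range(n)
    ]


def bin_frequency_hz(m, n=FRAME_LEN, rate=SAMPLE_RATE_HZ):
    """Frequency represented by DFT bin m for an n-sample frame."""
    return m * rate / n


frame_duration_ms = 1000 * FRAME_LEN / SAMPLE_RATE_HZ   # 40.96 ms per frame
bin_spacing_hz = SAMPLE_RATE_HZ / FRAME_LEN             # about 24.4 Hz between bins
nyquist_hz = SAMPLE_RATE_HZ / 2                         # 6250 Hz, above the 6 kHz band edge
```

Under these figures each frame spans about 41 ms and resolves frequencies roughly 24 Hz apart, which illustrates the trade-off between processing speed and resolution noted above.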

Spectral signals XLp(m) and XRp(m) are input to dual delay line 40 as 
further detailed in FIG. 3. FIG. 3 depicts two delay lines 42, 44 each having N 
number of delay stages. Each delay line 42, 44 is sequentially configured with delay 
stages D_1 through D_N. Delay lines 42, 44 are configured to delay corresponding input
signals in opposing directions from one delay stage to the next, and generally
correspond to the dual hearing channels associated with a natural binaural hearing
process. Delay stages D_1, D_2, D_3, ..., D_{N-2}, D_{N-1}, and D_N each delay an input
signal by corresponding time delay increments τ_1, τ_2, τ_3, ..., τ_{N-2}, τ_{N-1}, and τ_N
(collectively designated τ_i), where index i goes from left to right. For delay line 42,
XLp(m) is alternatively designated XLp^1(m). XLp^1(m) is sequentially delayed by time
delay increments τ_1, τ_2, τ_3, ..., τ_{N-2}, τ_{N-1}, and τ_N to produce delayed outputs at
the taps of delay line 42 which are respectively designated XLp^2(m), XLp^3(m),
XLp^4(m), ..., XLp^{N-1}(m), XLp^N(m), and XLp^{N+1}(m) (collectively designated
XLp^i(m)). For delay line 44, XRp(m) is alternatively designated XRp^{N+1}(m).
XRp^{N+1}(m) is sequentially delayed by time delay increments τ_1, τ_2, τ_3, ..., τ_{N-2},
τ_{N-1}, and τ_N to produce delayed outputs at the taps of delay line 44 which are
respectively designated XRp^N(m), XRp^{N-1}(m), XRp^{N-2}(m), ..., XRp^3(m),
XRp^2(m), and XRp^1(m) (collectively designated XRp^i(m)). The input spectral
signals and the signals from delay line 42, 44 taps are arranged as input pairs to
operation array 46. A pair of taps from delay lines 42, 44 is illustrated as input pair P 
in FIG. 3. 
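The tap arrangement described above may be sketched for a single frequency as follows. This is an illustrative sketch under stated assumptions rather than the implementation of processor 30: it assumes that delaying a spectral component by τ seconds is modeled as multiplication by exp(-j2πfτ), and that delay stage D_i sits between taps i and i+1 so that the right-channel signal traverses the stages in the opposite direction. The function name `delay_line_taps` is hypothetical.

```python
import cmath
import math


def delay_line_taps(xl, xr, taus, freq_hz):
    """Return (left_taps, right_taps), each with N+1 entries, for one frequency.

    Assumptions (not taken from the specification): a delay of tau seconds
    multiplies a spectral component by exp(-j*2*pi*f*tau), and stage D_i
    sits between taps i and i+1 on both delay lines.
    """
    n = len(taus)
    left = [xl]                      # tap 1 is the undelayed left input
    for tau in taus:                 # left signal travels D_1 -> D_N
        left.append(left[-1] * cmath.exp(-2j * math.pi * freq_hz * tau))
    right = [xr]                     # tap N+1 is the undelayed right input
    for tau in reversed(taus):       # right signal travels D_N -> D_1
        right.append(right[-1] * cmath.exp(-2j * math.pi * freq_hz * tau))
    right.reverse()                  # list index 0 now holds tap 1
    assert len(left) == len(right) == n + 1
    return left, right
```

With equal increments and identical inputs on both channels, as for a source on the reference axis, the pair at the center tap (i = (N/2)+1) has equal left and right members under this model.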

Operation array 46 has operation units (OP) numbered from 1 to N+l, depicted 
as OP1, OP2, OP3, OP4,..., OPN-2, OPN-1, OPN, OPN+1 and collectively 
designated operations OPi. Input pairs from delay lines 42, 44 correspond to the 
operations of array 46 as follows: OP1[XLp^1(m), XRp^1(m)], OP2[XLp^2(m),
XRp^2(m)], OP3[XLp^3(m), XRp^3(m)], OP4[XLp^4(m), XRp^4(m)], ...,
OPN-2[XLp^{N-2}(m), XRp^{N-2}(m)], OPN-1[XLp^{N-1}(m), XRp^{N-1}(m)],
OPN[XLp^N(m), XRp^N(m)], and OPN+1[XLp^{N+1}(m), XRp^{N+1}(m)]; where
OPi[XLp^i(m), XRp^i(m)] indicates that OPi is determined as a function of input pair
XLp^i(m), XRp^i(m). Correspondingly, the outputs of operation array 46 are Xp^1(m),
Xp^2(m), Xp^3(m), Xp^4(m), ..., Xp^{N-2}(m), Xp^{N-1}(m), Xp^N(m), and Xp^{N+1}(m)
(collectively designated Xp^i(m)).

For i = 1 to i ≤ N/2, operations for each OPi of array 46 are determined in
accordance with complex expression 1 (CE1) as follows:

$$
Xp^{i}(m) = \frac{XLp^{i}(m) - XRp^{i}(m)}{\exp[-j2\pi(\tau_{i} + \cdots + \tau_{N/2})f_{m}] - \exp[j2\pi(\tau_{(N/2)+1} + \cdots + \tau_{N-i+1})f_{m}]}
$$

where exp[argument] represents a natural exponent to the power of the argument, and
imaginary number j is the square root of -1. For i > ((N/2) + 1) to i = N+1,
operations of operation array 46 are determined in accordance with complex expression 2
(CE2) as follows:

$$
Xp^{i}(m) = \frac{XLp^{i}(m) - XRp^{i}(m)}{\exp[j2\pi(\tau_{(N/2)+1} + \cdots + \tau_{i-1})f_{m}] - \exp[-j2\pi(\tau_{N-i+2} + \cdots + \tau_{N/2})f_{m}]}
$$

where exp[argument] represents a natural exponent to the power of the argument, and
imaginary number j is the square root of -1. For i = (N/2) + 1, neither CE1 nor CE2
is performed.

An example of the determination of the operations for N = 4 (i = 1 to i = N+1)
is as follows:

i = 1, CE1 applies as follows:

$$
Xp^{1}(m) = \frac{XLp^{1}(m) - XRp^{1}(m)}{\exp[-j2\pi(\tau_{1} + \tau_{2})f_{m}] - \exp[j2\pi(\tau_{3} + \tau_{4})f_{m}]};
$$

i = 2 ≤ (N/2), CE1 applies as follows:

$$
Xp^{2}(m) = \frac{XLp^{2}(m) - XRp^{2}(m)}{\exp[-j2\pi\tau_{2}f_{m}] - \exp[j2\pi\tau_{3}f_{m}]};
$$

i = 3: Not applicable, (N/2) < i ≤ ((N/2)+1);

i = 4, CE2 applies as follows:

$$
Xp^{4}(m) = \frac{XLp^{4}(m) - XRp^{4}(m)}{\exp[j2\pi\tau_{3}f_{m}] - \exp[-j2\pi\tau_{2}f_{m}]};
$$

and, i = 5, CE2 applies as follows:

$$
Xp^{5}(m) = \frac{XLp^{5}(m) - XRp^{5}(m)}{\exp[j2\pi(\tau_{3} + \tau_{4})f_{m}] - \exp[-j2\pi(\tau_{1} + \tau_{2})f_{m}]}.
$$
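The index ranges of CE1 and CE2 may be checked against the N = 4 example with a short sketch such as the following. The function names `ce_denominator` and `operation_output` are hypothetical, N is assumed to be even, and the list `taus` is assumed to hold τ_1 through τ_N in order.

```python
import cmath
import math


def ce_denominator(i, taus, freq_hz):
    """Denominator of CE1 (i <= N/2) or CE2 (i > N/2 + 1) for operation OPi.

    taus[k - 1] holds the increment tau_k; N = len(taus) is assumed even.
    Returns None for the center operation i = N/2 + 1, where neither applies.
    """
    n = len(taus)
    if not 1 <= i <= n + 1:
        raise ValueError("operation index i must lie in 1..N+1")
    half = n // 2
    w = 2 * math.pi * freq_hz
    if i <= half:                        # CE1
        a = sum(taus[i - 1:half])        # tau_i + ... + tau_{N/2}
        b = sum(taus[half:n - i + 1])    # tau_{N/2+1} + ... + tau_{N-i+1}
        return cmath.exp(-1j * w * a) - cmath.exp(1j * w * b)
    if i > half + 1:                     # CE2
        a = sum(taus[half:i - 1])        # tau_{N/2+1} + ... + tau_{i-1}
        b = sum(taus[n - i + 1:half])    # tau_{N-i+2} + ... + tau_{N/2}
        return cmath.exp(1j * w * a) - cmath.exp(-1j * w * b)
    return None                          # center operation


def operation_output(i, xl_i, xr_i, taus, freq_hz):
    """Xp^i(m) = (XLp^i(m) - XRp^i(m)) / denominator, or None at the center."""
    d = ce_denominator(i, taus, freq_hz)
    return None if d is None else (xl_i - xr_i) / d
```

For `taus` of length four, `ce_denominator(1, ...)` sums τ_1+τ_2 and τ_3+τ_4, and `ce_denominator(4, ...)` uses τ_3 and τ_2, matching the example. The sketch does not guard against a denominator of zero, which a practical implementation would need to handle.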

Referring to FIGS. 1-3, each OPi of operation array 46 is defined to be 
representative of a different azimuthal position relative to reference axis R. The 
"center" operation, OPi where i = ((N/2)+l), represents the location of the reference 
axis and source 12. For the example N=4, this center operation corresponds to i = 3.
This arrangement is analogous to the different interaural time differences associated 
with a natural binaural hearing system. In these natural systems, there is a relative 
position in each sound passageway within the ear that corresponds to a maximum "in 
phase" peak for a given sound source. Accordingly, each operation of array 46 
represents a position corresponding to a potential azimuthal or angular position range 
for a sound source, with the center operation representing a source at the zero azimuth 
- a source aligned with reference axis R. For an environment having a single source 
without noise or interference, determining the signal pair with the maximum strength 
may be sufficient to locate the source with little additional processing; however, in 
noisy or multiple source environments, further processing may be needed to properly
estimate locations. 

It should be understood that dual delay line 40 provides a two dimensional matrix 
of outputs with N+1 columns corresponding to Xp^i(m), and M rows corresponding to
each discrete frequency f_m of Xp^i(m). This (N+1)×M matrix is determined for each
short-term spectral analysis interval p. Furthermore, by subtracting XRp^i(m) from
XLp^i(m), the denominator of each expression CE1, CE2 is arranged to provide a
minimum value of Xp^i(m) when the signal pair is "in-phase" at the given frequency f_m.
Localization stage 70 uses this aspect of expressions CE1, CE2 to evaluate the location
of source 14 relative to source 12.

Localization stage 70 accumulates P number of these matrices to determine the 
Xp^i(m) representative of the position of source 14. For each column i, localization
stage 70 performs a summation of the amplitude of |Xp^i(m)| to the second power over
frequencies f_m from m = 1 to m = M. The summation is then multiplied by the inverse
of M to find an average spectral energy as follows: 

$$
Xavgp^{i} = \frac{1}{M} \sum_{m=1}^{M} |Xp^{i}(m)|^{2}.
$$

The resulting averages, Xavgp^i, are then time averaged over the P most recent
spectral-analysis time frames indexed by p in accordance with:

$$
X^{i} = \sum_{p=1}^{P} \gamma_{p} \, Xavgp^{i},
$$

where γ_p are empirically determined weighting factors. In one embodiment, the γ_p
factors are preferably between 0.85^p and 0.90^p, where p is the short-term spectral
analysis time frame index. The X^i are analyzed to determine the minimum value,
min(X^i). The index i of min(X^i), designated "I," estimates the column representing the
azimuthal location of source 14 relative to source 12.
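The localization steps above (average spectral energy per column, weighted time average over the P most recent frames, then selection of the minimum) may be sketched as follows. The function name `locate_interferer`, the default base `gamma = 0.87` for the weighting factor, and the convention of marking the center column with `None` are assumptions made for illustration only.

```python
def locate_interferer(frames, gamma=0.87):
    """Return the 1-based column index I with the smallest weighted energy.

    frames[p - 1][i - 1] is the list of M complex values Xp^i(m) for frame p,
    with p = 1 the most recent frame. Columns holding None (assumed here to
    mark the center operation, which has no CE1/CE2 output) are skipped.
    gamma**p is an assumed form of the weighting factor for frame p.
    """
    n_cols = len(frames[0])
    best_index, best_value = None, float("inf")
    for i in range(n_cols):
        if frames[0][i] is None:
            continue
        total = 0.0
        for p, frame in enumerate(frames, start=1):
            column = frame[i]
            # Average spectral energy over the M frequencies of this column.
            avg_energy = sum(abs(x) ** 2 for x in column) / len(column)
            total += (gamma ** p) * avg_energy
        if total < best_value:
            best_index, best_value = i + 1, total
    return best_index
```

The returned index plays the role of "I" above; because `gamma ** p` shrinks as p grows, more recent frames contribute more to the estimate.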

It has been discovered that the spectral content of a desired signal from source 
12, when approximately aligned with reference axis R1, can be estimated from Xp^I(m).
In other words, the spectral signal output by array 46 which most closely corresponds 
to the relative location of the "off-axis" source 14 contemporaneously provides a 
spectral representation of a signal emanating from source 12. As a result, the signal 
processing of dual delay line 40 not only facilitates localization of source 14, but also 
provides a spectral estimate of the desired signal with only minimal post-localization 
processing to produce a representative output. 

Post-localization processing includes provision of a designation signal by 
localization stage 70 to conceptual "switch" 80 to select the output column Xp^I(m) of
the dual delay line 40. The Xp^I(m) is routed by switch 80 to an inverse Discrete
Fourier Transform algorithm (Inverse DFT) in stage 82 for conversion from a
frequency domain signal representation to a discrete time domain signal representation
denoted as s(k). The signal estimate s(k) is then converted by Digital to Analog (D/A)
converter 84 to provide an output signal to output device 90.

Output device 90 amplifies the output signal from processor 30 with amplifier
92 and supplies the amplified signal to speaker 94 to provide the extracted signal from
source 12.

It has been found that interference from off-axis sources separated by as little as 
2 degrees from the on-axis source may be reduced or eliminated with the present
invention - even when the desired signal includes speech and the interference includes 
babble. Moreover, the present invention provides for the extraction of desired signals 
even when the interfering or noise signal is of equal or greater relative intensity. By 
moving sensors 22, 24 in tandem the signal selected to be extracted may 
correspondingly be changed. Moreover, the present invention may be employed in an 
environment having many sound sources in addition to sources 12, 14. In one 
alternative embodiment, the localization algorithm is configured to dynamically respond 
to relative positioning as well as relative strength, using automated learning techniques. 
In other embodiments, the present invention is adapted for use with highly directional 
microphones, more than two sensors to simultaneously extract multiple signals, and 
various adaptive amplification and filtering techniques known to those skilled in the art. 

The present invention greatly improves computational efficiency compared to 
conventional systems by determining a spectral signal representative of the desired 
signal as part of the localization processing. As a result, an output signal characteristic 
of a desired signal from source 12 is determined as a function of the signal pair 
XLp^I(m), XRp^I(m) corresponding to the separation of source 14 from source 12.
Also, the exponents in the denominator of CE1, CE2 correspond to phase difference
of frequencies f_m resulting from the separation of source 12 from 14. Referring to
the example of N=4 and assuming that I = 1, this phase difference is
-2π(τ_1+τ_2)f_m (for delay line 42) and 2π(τ_3+τ_4)f_m (for delay line 44) and
corresponds to the separation of the representative location of off-axis source 14 from
the on-axis source 12 at i=3. Likewise the time increments, τ_1+τ_2 and τ_3+τ_4,
correspond to the separation of source 14 from source 12 for this example. Thus,
processor 30 implements dual delay line 40 and corresponding operational relationships 
CE1, CE2 to provide a means for generating a desired signal by locating the position of 
an interfering signal source relative to the source of the desired signal. 

It is preferred that τ_i be selected to provide generally equal azimuthal positions
relative to reference axis R. In one embodiment, this arrangement corresponds to the
values of τ_i changing about 20% from the smallest to the largest value. In other
embodiments, τ_i are all generally equal to one another, simplifying the operations of
array 46. Notably, the pair of time increments in the denominator of CE1, CE2
corresponding to the separation of the sources 12 and 14 become approximately equal
when all values τ_i are generally the same.

Processor 30 may be comprised of one or more components or pieces of 
equipment. The processor may include digital circuits, analog circuits, or a 
combination of these circuit types. Processor 30 may be programmable, an integrated
state machine, or utilize a combination of these techniques. Preferably, processor 30 is
a solid state integrated digital signal processor circuit customized to perform the 
process of the present invention with a minimum of external components and 
connections. Similarly, the extraction process of the present invention may be 
performed on variously arranged processing equipment configured to provide the 
corresponding functionality with one or more hardware modules, firmware modules, 
software modules, or a combination thereof. Moreover, as used herein, "signal" 
includes, but is not limited to, software, firmware, hardware, programming variable, 
communication channel, and memory location representations. 

Referring to FIG. 4A, one application of the present invention is depicted as 
hearing aid system 110. System 110 includes eyeglasses G with microphones 122 and 
124 fixed to glasses G and displaced from one another. Microphones 122, 124 are 
operatively coupled to hearing aid processor 130. Processor 130 is operatively coupled
to output device 190. Output device 190 is positioned in ear E to provide an audio
signal to the wearer. 

Microphones 122, 124 are utilized in a manner similar to sensors 22, 24 of the 
embodiment depicted by FIGS. 1-3. Similarly, processor 130 is configured with the
signal extraction process depicted in FIGS. 1-3. Processor 130 provides the
extracted signal to output device 190 to provide an audio output to the wearer. The 
wearer of system 110 may position glasses G to align with a desired sound source, such 
as a speech signal, to reduce interference from a nearby noise source off axis from the 
midpoint between microphones 122, 124. Moreover, the wearer may select a different 
signal by realigning with another desired sound source to reduce interference from a 
noisy environment. 

Processor 130 and output device 190 may be separate units (as depicted) or 
included in a common unit worn in the ear. The coupling between processor 130 and 
output device 190 may be an electrical cable or a wireless transmission. In one 
alternative embodiment, sensors 122, 124 and processor 130 are remotely located and 
are configured to broadcast to one or more output devices 190 situated in the ear E via 
a radio frequency transmission or other conventional telecommunication method. 

FIG. 4B shows a voice recognition system 210 employing the present invention as a 
front end speech enhancement device. System 210 includes personal computer C with two 
microphones 222, 224 spaced apart from each other in a predetermined relationship. 
Microphones 222, 224 are operatively coupled to a processor 230 within computer C. 
Processor 230 provides an output signal for internal use or responsive reply via speakers 
294a, 294b or visual display 296. An operator aligns in a predetermined relationship with 
microphones 222, 224 of computer C to deliver voice commands. Computer C is configured 
to receive these voice commands, extracting the desired voice command from a noisy 
environment in accordance with the process system of FIGS. 1-3.

All publications and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were 
specifically and individually indicated to be incorporated by reference. 



EXPERIMENTAL SECTION 

The following experimental results are provided as nonlimiting examples, and 
should not be construed to restrict the scope of the present invention. 

A Sun Sparc-20 workstation was programmed to emulate the signal extraction 
process of the present invention. One loudspeaker (L1) was used to emit a speech
signal and another loudspeaker (L2) was used to emit babble noise in a semi-anechoic
room. Two microphones of a conventional type were positioned in the room and
operatively coupled to the workstation. The microphones had an inter-microphone
distance of about 15 centimeters and were positioned about 3 feet from L1. L1 was
aligned with the midpoint between the microphones to define a zero degree azimuth.
L2 was placed at different azimuths relative to L1 approximately equidistant to the
midpoint between L1 and L2.

Referring to FIG. 5, a clean speech signal for a sentence about two seconds long is
depicted, emanating from L1 without interference from L2. FIG. 6 depicts a
composite signal from L1 and L2. The composite signal includes babble noise from L2
combined with the speech signal depicted in FIG. 5. The babble noise and speech
signal are of generally equal intensity (0 dB) with L2 placed at a 60 degree azimuth
relative to L1. FIG. 7 depicts the signal recovered from the composite signal of FIG.
6. This signal is nearly the same as the signal of FIG. 5.

FIG. 8 depicts another composite signal where the babble noise is 30 dB more
intense than the desired signal of FIG. 5. Furthermore, L2 is placed at only a 2 degree
azimuth relative to L1. FIG. 9 depicts the signal recovered from the composite signal
of FIG. 8, providing a clearly intelligible representation of the signal of FIG. 5 despite
the greater intensity of the babble noise from L2 and the nearby location.
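For orientation, the decibel figures quoted above may be converted to linear ratios with the standard definitions sketched below. The function names are hypothetical and the sketch is not part of the reported experiment.

```python
def db_to_power_ratio(db):
    """Power (intensity) ratio corresponding to a level difference in dB."""
    return 10 ** (db / 10)


def db_to_amplitude_ratio(db):
    """Amplitude ratio corresponding to the same level difference in dB."""
    return 10 ** (db / 20)
```

On these definitions, 0 dB corresponds to equal intensity, while a 30 dB difference corresponds to a thousandfold intensity ratio, or an amplitude ratio of roughly 31.6.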

While the invention has been illustrated and described in detail in the drawings and 
foregoing description, the same is to be considered as illustrative and not restrictive in 
character, it being understood that only the preferred embodiment has been shown and 
described and that all changes and modifications that come within the spirit of the invention 
are desired to be protected.

We claim:



1. A method of signal processing, comprising: 

(a) detecting an acoustic excitation at a first location to provide a corresponding 
first signal and at a second location to provide a corresponding second signal, the excitation 
including a desired acoustic signal from a first source and an interfering acoustic signal from 
a second source spaced apart from the first source; 

(b) localizing the second source relative to the first source as a function of the first 
and second signals; and 

(c) generating a characteristic signal representative of the desired acoustic signal 
during performance of said localizing. 

2. The method of claim 1, wherein the characteristic signal corresponds to spectral 
content of the desired acoustic signal and further comprising providing an output signal 
representative of the desired acoustic signal as a function of the characteristic signal. 

3. The method of claim 1, wherein said localizing includes:

(b1) delaying each of the first and second signals by a number of time intervals to
provide a number of delayed first signals and a number of delayed second signals; and 

(b2) determining a time interval representative of separation of the first source from 
the second source, the characteristic signal being a function of the time interval. 

4. The method of claim 1, wherein said localizing includes:

(b1) delaying each of the first and second signals by a number of time intervals to
provide a number of delayed first signals and a number of delayed second signals; and

(b2) establishing a signal pair, the signal pair having a first member from the
delayed first signals and a second member from the delayed second members, the 
characteristic signal being determined from the signal pair. 

5. The method of claim 1, further comprising providing an output signal representative of 
the desired acoustic signal, and wherein the desired acoustic signal includes speech and the 
output signal is provided by a hearing aid device. 

6. The method of claim 1, wherein said localizing further includes: 

(b1) converting the first and second signals from an analog representation to a
discrete representation; 

(b2) transforming the first and second signals from a time domain representation to a 
frequency domain representation; 

(b3) delaying each of the first and second signals by a number of time intervals to 
provide a number of delayed first signals and a number of delayed second signals; and 

(b4) establishing a first time increment and a signal pair each representative of 
separation of the first source from the second source, the signal pair having a first member 
from the delayed first signals and a second member from the delayed second members. 

7. The method of claim 6, wherein the characteristic signal corresponds to a fraction with 
a numerator determined from at least the first and second members, and a denominator 
determined from at least the first time increment. 

8. The method of claim 6, wherein said generating further includes: 

(c1) determining the characteristic signal from the signal pair and the first time
increment, the characteristic signal being representative of spectral content of the desired
acoustic signal;

(c2) transforming the characteristic signal from a frequency domain representation to
a time domain representation; 

(c3) converting the characteristic signal from a discrete representation to an analog 
representation; and 

(c4) providing an audio output signal representative of the desired acoustic signal as 
a function of the characteristic signal. 

9. The method of claim 8, further comprising establishing a second time increment 
corresponding to separation of the first source from the second source by comparing the 
delayed first and second signals, and 

wherein the first time increment corresponds to a first phase difference, the second 
time increment corresponds to a second phase difference, and the characteristic signal 
includes a spectral representation determined from at least the first and second phase 
differences. 

10. The method of claim 1, wherein the desired acoustic signal has an intensity greater than 
the interfering acoustic signal when the first and second sources are each generally 
equidistant from a midpoint between the first and second locations. 

11. The method of claim 1, wherein separation of the second source is within five degrees
of the first source relative to a zero degree azimuthal reference axis intersecting the first 
source and a midpoint situated between the first and second locations. 

12. The method of claim 1, further comprising: 

(d) establishing a number of location signals, each corresponding to a different 
location relative to the first source; and

(e) selecting the characteristic signal from the location signals, the
characteristic signal being representative of location of the second source relative to the
first source, the characteristic signal including a spectral representation of the desired
acoustic signal.

13. A signal processing system, comprising: 

(a) a first sensor at a first location configured to provide a first signal 
corresponding to an acoustic signal, said acoustic signal including a desired signal emanating 
from a selected source and noise emanating from a noise source; 

(b) a second sensor at a second location configured to provide a second signal 
corresponding to said acoustic signal; 

(c) a signal processor responsive to said first and second signals to generate a 
discrete first spectral signal corresponding to said first signal and a discrete second spectral 
signal corresponding to said second signal, said processor being configured to delay said first 
and second spectral signals by a number of time intervals to generate a number of delayed 
first signals and a number of delayed second signals and provide a time increment signal, 
said time increment signal corresponding to separation of the selected source from the noise 
source, and said processor being further configured to generate an output signal as a function 
of said time increment signal; and 

(d) an output device responsive to said output signal to provide an output 
representative of said desired signal. 

14. The system of claim 13, wherein said first and second sensors each include a
microphone and said output device includes an audio speaker. 

15. The system of claim 13, wherein said processor includes an analog to digital 
conversion circuit configured to provide said discrete first spectral signal.

16. The system of claim 13, wherein generation of said first and second spectral signals
includes execution of a discrete Fourier transform algorithm.

17. The system of claim 13, wherein said first and second sensors are configured for 
movement to select said desired signal in accordance with position of said first and second 
sensors, said first and second sensors being configured to be spatially fixed relative to each 
other. 

18. The system of claim 13, wherein each of said delayed first signals correspond to one of 
a number of first taps from a first delay line, and each of said delayed second signals 
correspond to one of a number of second taps from a second delay line. 

19. The system of claim 18, wherein determination of said output signal corresponds to: 
said first and second delay lines being configured in a dual delay line configuration; 
said discrete first spectral signal being input to said first delay line and said discrete 

second spectral signal being input to said second delay line; and 

each of said first taps, said second taps, and said first and second spectral signals being 
arranged as a number of signal pairs, said signal pairs including a first portion of signal pairs 
and a second portion of signal pairs, said processor being configured to perform a first 
operation on each of said signal pairs of said first portion as a function of said time intervals, 
said processor being configured to perform a second operation on each of said signal pairs of 
said second portion as a function of said time intervals, said first operation being different 
from said second operation.

20. A signal processing system, comprising:

(a) a first sensor configured to provide a first signal corresponding to an acoustic 
excitation, said excitation including a first acoustic signal from a first source and a second 
acoustic signal from a second source displaced from the first source; 

(b) a second sensor displaced from said first sensor and configured to provide a 
second signal corresponding to said excitation; 

(c) a processor responsive to said first and second sensor signals, said processor 
including a means for generating a desired signal having a spectrum representative of said 
first acoustic signal; and 

(d) an output means for generating a sensory output in response to said desired 
signal. 

21. The system of claim 20, wherein said first and second sensors each include a 
microphone and said output device includes an audio speaker. 

22. The system of claim 20, wherein said generating means includes executing a discrete 
Fourier transform algorithm.

23. The system of claim 20, wherein said processor includes an analog to digital 
conversion circuit and a digital to analog conversion circuit. 

24. The system of claim 20, wherein each of said delayed first signals correspond to one of 
a number of first taps from a first delay line, and each of said delayed second signals 
correspond to one of a number of second taps from a second delay line. 

25. The system of claim 20, wherein said first and second sensors are configured for 
movement to select said desired signal in accordance with position of said first and second
sensors, said first and second sensors being configured to be spatially fixed relative to each
other. 

26. A method of signal processing, comprising: 

(a) positioning a first and second sensor relative to a first signal source, the first 
and second sensor being spaced apart from each other, and a second signal source being 
spaced apart from the first signal source; 

(b) providing a first signal from the first sensor and a second signal from the 
second signal, the first and second signals each being representative of a composite acoustic 
signal including a desired signal from the first signal source and an unwanted signal from the 
second signal source; 

(c) establishing a number of spectral signals from the first and second signals as a 
function of a number of frequencies, each of the spectral signals representing a different 
position relative to the first signal source; 

(d) determining a member of the spectral signals representative of position of the 
second signal source; and 

(e) generating an output signal from the member, the output signal being 
representative of spectral content of the first signal. 

27. The method of claim 26, wherein the member is determined as a function of a phase 
difference value for a number of frequencies delayed by a first amount and a second amount. 

28. The method of claim 26, wherein the desired signal includes speech and the output 
signal is provided by a hearing aid device. 

29. The method of claim 26, further comprising repositioning the first and second sensors 
to extract a third signal from a third signal source.

30. The method of claim 26, wherein said establishing includes:

(a1) delaying each of the first and second signals by a number of time intervals to
generate a number of delayed first signals and a number of delayed second signals; and 

(a2) comparing each of the delayed first signals to a corresponding one of the 
delayed second signals, each of the spectral signals being a function of at least one of the 
delayed first and second signals.

ABSTRACT OF THE DISCLOSURE



A desired acoustic signal is extracted from a noisy environment by generating a 
signal representative of the desired signal with a processor for a hearing aid device. 
The processor receives binaural signals from two microphones at different locations. 
The binaural inputs to the processor are converted from analog to digital format and 
then submitted to a discrete Fourier transform process to generate discrete spectral 
signal representations. The spectral signals are delayed by a number of time intervals 
in a dual delay line to provide a number of intermediate signals, each corresponding to 
a different position relative to a desired signal source. Location of the noise source is 
determined and the spectral content of the desired signal is determined from the 
intermediate signal corresponding to the noise source location. Inverse transformation 
of the selected intermediate signal followed by digital to analog conversion provides an 
output signal representative of the desired signal.

Patent Claims

1.-33. (Canceled). 

34. (Previously presented) A method of signal processing, comprising: 

(a) detecting an acoustic excitation at both a first location to provide a
corresponding first signal and at a second location to provide a corresponding second
signal, the excitation being a composite of a desired acoustic signal from a first source 
and an interfering acoustic signal from a second source spaced apart from the first source; 

(b) determining location of the second source relative to the first source as a 
function of the first and second signals, which includes delaying each of the first and 
second signals by several time intervals to provide several delayed first signals and 
several delayed second signals and providing a time increment representative of 
separation of the first source from the second source; and 

(c) generating a characteristic signal representative of the desired acoustic 
signal during performance of said determining, the characteristic signal being a function 
of the time increment. 

35. (Previously presented) The method of claim 34, wherein the characteristic signal 
corresponds to spectral content of the desired acoustic signal and further comprising 
providing an output signal representative of the desired acoustic signal as a function of 
the characteristic signal. 

36. (Canceled). 

37. (Currently amended) The method of claim 34, wherein said determining includes: 

establishing a signal pair, the signal pair having a first member from the delayed 
first signals and a second member from the delayed second signals, the characteristic 
signal being determined from the signal pair. 

38. (Previously presented) The method of claim 34, further comprising providing an 
output signal representative of the desired acoustic signal, and wherein the desired 
acoustic signal includes speech and the output signal is provided by a hearing aid device. 

39. (Currently amended) The method of claim 34, wherein said determining further 
includes: 

(b1) converting the first and second signals from an analog representation to a 
discrete representation; 

(b2) transforming the first and second signals from a time domain 
representation to a frequency domain representation; and 

(b3) establishing a signal pair representative of separation of the first source from 
the second source, the signal pair having a first member from the delayed first signals and 
a second member from the delayed second signals. 

40. (Currently amended) The method of claim 39, wherein the characteristic signal 
corresponds to a fraction with a numerator determined from at least the first and second 
members, and a denominator determined from at least the time increment. 

Patent Claim List 
Application No.: 09/193,058 
Inventors: Feng, et al. 
Filed: November 16, 1998 
Page 2 of 10 



22010-127/321877 

41. (Previously presented) The method of claim 39, wherein said generating further 
includes: 

(c1) determining the characteristic signal from the signal pair and the first time 
increment, the characteristic signal being representative of spectral content of the desired 
acoustic signal; 

(c2) transforming the characteristic signal from a frequency domain 
representation to a time domain representation; and 

(c3) providing an audio output signal representative of the desired acoustic 
signal as a function of the characteristic signal. 

42. (Currently amended) The method of claim 41, further comprising establishing a 
further time increment corresponding to separation of the first source from the second 
source by comparing the delayed first and second signals, and 

wherein the time increment corresponds to a first phase difference, the further 
time increment corresponds to a second phase difference, and the characteristic signal 
includes a spectral representation determined from at least the first and second phase 
differences. 
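
For orientation, the correspondence between a time increment and a phase difference can be written out for a single spectral component. This is a general property of delayed sinusoids and is not a formula taken from the application; the symbols $\tau$, $f$, and $\varphi$ and the numerical values below are illustrative assumptions.

```latex
% A delay of \tau seconds applied to a component of frequency f (in Hz)
% multiplies its spectral coefficient by a unit-magnitude phase factor:
X_{\mathrm{delayed}}(f) = X(f)\, e^{-j 2\pi f \tau},
\qquad
\varphi = 2\pi f \tau .
% Illustrative values (assumed): \tau = 15\ \mu\mathrm{s},\ f = 1\ \mathrm{kHz}
% \varphi = 2\pi \cdot 1000 \cdot 15 \times 10^{-6}
%         \approx 0.094\ \mathrm{rad} \approx 5.4^{\circ}
```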

43. (Canceled). 

44. (Previously presented) The method of claim 34, wherein separation of the second 
source is within five degrees of the first source relative to a zero degree azimuthal 
reference axis intersecting the first source and a midpoint situated between the first and 
second locations. 

45. (Previously presented) The method of claim 34, further comprising: 

(d) establishing a number of location signals each corresponding to a different 
location relative to the first source; and 

(e) selecting the characteristic signal from the location signals, the 
characteristic signal being representative of the location of the second source relative to 
the first source, the characteristic signal including a spectral representation of the desired 
acoustic signal. 

46. (Previously presented) A method of signal processing, comprising: 

(a) detecting an acoustic excitation at a first location to provide a 
corresponding first signal and at a second location to provide a corresponding second 
signal, the excitation being a composite of a desired acoustic signal from a first source 
and an interfering acoustic signal from a second source spaced apart from the first source; 

(b) localizing the second source relative to the first source as a function of the 
first and second signals, said localizing including establishing a number of location 
signals each corresponding to a different location relative to the first source, delaying 
each of the first and second signals by a number of time intervals to provide a number of 
delayed first signals and a number of delayed second signals, and establishing a signal 
pair that has a first member from the delayed first signals and a second member from the 
delayed second signals; and 

(c) generating a characteristic signal from the location signals, wherein the 
characteristic signal includes a spectral representation of the desired acoustic signal from 
the first source, corresponds to position of the second source, and is determined from the 
signal pair. 

47. (Previously presented) The method of claim 46, further comprising providing an 
output signal representative of the desired acoustic signal as a function of the 
characteristic signal. 

48. (Currently amended) The method of claim 46, wherein said localizing includes: 

determining a time increment representative of separation of the first source from 
the second source, the characteristic signal being a function of the time increment. 

49. (Canceled). 

50. (Previously presented) The method of claim 46, further comprising providing an 
output signal representative of the desired acoustic signal, and wherein the desired 
acoustic signal includes speech and the output signal is provided by a hearing aid device. 

51. (Currently amended) The method of claim 46, wherein said localizing further 
includes: 

(b1) converting the first and second signals from an analog representation to a 
discrete representation; 

(b2) transforming the first and second signals from a time domain 
representation to a frequency domain representation; and 

(b3) establishing a first time increment and a signal pair each representative of 
separation of the first source from the second source, the signal pair having a first 
member from the delayed first signals and a second member from the delayed second 
signals. 

52. (Previously presented) The method of claim 51, wherein the characteristic signal 
corresponds to a fraction with a numerator determined from at least the first and second 
members, and a denominator determined from at least the first time increment. 

53. (Previously presented) The method of claim 51, wherein said generating further 
includes: 

(c1) determining the characteristic signal from the signal pair and the first time 
increment; 

(c2) transforming the characteristic signal from a frequency domain 
representation to a time domain representation; and 

(c3) providing an audio output signal representative of the desired acoustic 
signal as a function of the characteristic signal. 

54. (Previously presented) The method of claim 53, further comprising establishing a 
second time increment corresponding to separation of the first source from the second 
source by comparing the delayed first signals and delayed second signals, and 

wherein the first time increment corresponds to a first phase difference, the 
second time increment corresponds to a second phase difference, and the spectral 
representation of the characteristic signal is determined from at least the first and second 
phase differences. 

55. (Canceled). 

56. (Previously presented) The method of claim 1, wherein separation of the second 
source is within five degrees of the first source relative to a zero degree azimuthal 
reference axis intersecting the first source and a midpoint situated between the first and 
second locations. 

57. (New) The method of claim 34, wherein the characteristic signal corresponds to a 
fraction with a numerator determined from a difference between a first member of the 
delayed first signals and a second member of the delayed second signals, and a 
denominator determined from at least the time increment. 
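
The fraction recited above can be illustrated with a simple two-source model. The following is a sketch under stated assumptions and is not the formula of the application: the desired source is assumed to lie on the reference axis, so it reaches both sensors simultaneously, and the interfering source is assumed to reach the second sensor $d$ seconds after the first. The symbols $X_1$, $X_2$, $S$, $V$, $d$, and $\delta$ are introduced here for illustration only.

```latex
% Assumed spectra at the two sensors (S: desired, V: interfering):
X_1(\omega) = S(\omega) + V(\omega), \qquad
X_2(\omega) = S(\omega) + V(\omega)\, e^{-j\omega d}.

% Delaying the first signal by a candidate increment \delta and subtracting:
X_1(\omega)\, e^{-j\omega\delta} - X_2(\omega)
  = S(\omega)\bigl(e^{-j\omega\delta} - 1\bigr)
  + V(\omega)\bigl(e^{-j\omega\delta} - e^{-j\omega d}\bigr).

% For \delta = d the interfering term vanishes, leaving a fraction whose
% numerator is a difference of delayed signals and whose denominator
% depends only on the time increment:
\hat{S}(\omega)
  = \frac{X_1(\omega)\, e^{-j\omega d} - X_2(\omega)}{e^{-j\omega d} - 1},
\qquad \omega d \notin 2\pi\mathbb{Z}.
```

Under this model the estimate is undefined at frequencies where $\omega d$ is a multiple of $2\pi$ (including zero frequency), so any practical use of such a fraction would need separate handling of those components.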

58. (New) The method of claim 57, which includes providing the delayed first signals 
from a first multistage delay line and the delayed second signals from a second multistage 
delay line, the first member being output by a stage of the first delay line corresponding 
to the location of the second source and the second member being output by a stage of the 
second delay line corresponding to the location of the second source, and a different stage 
of each of the first delay line and the second delay line corresponding to location of the 
first source. 

59. (New) The method of claim 58, wherein the difference is representative of a 
minimized interfering acoustic signal level and provides the characteristic signal 
representative of spectral content of the desired acoustic signal. 

60. (New) The method of claim 46, wherein the generating includes determining the 
characteristic signal as a fraction with a numerator being a function of a difference 
between one of the delayed first signals and one of the delayed second signals, the 
difference being representative of a minimized interfering acoustic signal level, and the 
fraction having a denominator determined as a function of at least the first time 
increment. 

61. (New) A method of signal processing, comprising: 

detecting an acoustic excitation at both a first location to provide a corresponding first 
signal and at a second location to provide a corresponding second signal, the excitation being a 
composite of a desired acoustic signal from a first source and an interfering acoustic signal 
from a second source spaced apart from the first source; 

incrementally delaying the first signal to provide a number of delayed first signals and the 
second signal to provide a number of delayed second signals, a number of different pairings of 
the delayed first signals and the delayed second signals representing different locations; 

localizing the second source relative to one of the different locations as a function of a 
difference between the members of a corresponding one of the different pairings; and 

generating a characteristic signal representative of spectral content of the desired acoustic 
signal based on the difference and a time increment corresponding to distance separating the 
first source and the second source. 
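
The steps recited above (pairing delayed signals, localizing from the difference between pair members, and generating a spectral estimate from the difference and a time increment) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the application: the two delay lines are collapsed into a single relative delay in whole samples, the desired source is assumed to be on-axis, and the pairing is chosen by minimum mean energy of the estimate, on the assumption that the pairing that cancels the interfering source leaves the least residual. All function names are hypothetical.

```python
import cmath


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def estimate_desired(X1, X2, delta, eps=1e-9):
    """Per-bin estimate of the on-axis source for a candidate relative delay
    `delta` (in samples). Bins whose denominator vanishes are returned as None."""
    n = len(X1)
    estimate = []
    for k in range(n):
        phase = cmath.exp(-2j * cmath.pi * k * delta / n)  # delay applied to X1
        denominator = phase - 1
        if abs(denominator) < eps:
            estimate.append(None)  # difference carries no information here
        else:
            estimate.append((X1[k] * phase - X2[k]) / denominator)
    return estimate


def mean_energy(estimate):
    """Mean squared magnitude over the bins that could be estimated."""
    values = [abs(e) ** 2 for e in estimate if e is not None]
    return sum(values) / len(values) if values else float("inf")


def localize_and_extract(X1, X2, candidate_delays):
    """Pick the candidate delay with the lowest-energy estimate (assumed
    criterion) and return it together with the corresponding estimate."""
    best = min(candidate_delays,
               key=lambda delta: mean_energy(estimate_desired(X1, X2, delta)))
    return best, estimate_desired(X1, X2, best)
```

In this sketch the zero-frequency bin, and any bin where the candidate delay corresponds to a whole number of periods, cannot be recovered from the difference and is left as `None`.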

62. (New) A method of signal processing, comprising: 

detecting an acoustic excitation at both a first location to provide a corresponding first 
signal and at a second location to provide a corresponding second signal, the excitation being a 
composite of a desired acoustic signal from a first source and an interfering acoustic signal 
from a second source spaced apart from the first source; 

selecting the desired acoustic signal by positioning a reference axis relative to the first 
source; 



localizing the second source relative to the reference axis as a function of the first and 
second signals; and 

generating a characteristic signal representative of the desired acoustic signal during 
performance of said localizing. 



63. (New) The method of claim 62, which includes: 

defining the reference axis relative to the first location and the second location; and 
moving the reference axis to select a different acoustic signal. 

64. (New) The method of claim 63, wherein the detecting the acoustic excitation is 
performed with a first sensor at the first location and a second sensor at the second location. 

65. (New) The method of claim 63, wherein the method is performed with a hearing aid. 

66. (New) The method of claim 63, wherein: 

the localizing includes establishing a number of delayed first signals each corresponding 
to a different one of a number of first delay stages of a first delay line and a number of delayed 
second signals each corresponding to a different one of a number of second delay stages of a 
second delay line; and 

the generating includes determining the characteristic signal as a function of a fraction 
with a numerator corresponding to a difference between one output of the first delay stages and 
one output of the second delay stages and a denominator corresponding to a time increment 
representative of a distance separating the first source and the second source. 

CONFIDENTIAL 

INVENTION DISCLOSURE 

An invention may be any new product, process (method or use) or composition, or improvement thereof. To be patentable 
or protectable an invention must be novel (as compared to the existing state of the art), useful and non-obvious to a person 
with ordinary skill in the art. The creator should consider these standards when answering the questions outlined below. 
Complete, detailed answers will help with the evaluation and management of your invention and will serve to protect 
patent rights until an application for patent may be filed. 



1. TITLE OF INVENTION: 

New dual-microphone-based signal extraction algorithm that can acquire an acoustic signal 
faithfully in the presence of an intense noise originating from a nearby source. 

2. CREATOR'S NAME(S): 

Full name of those who worked on the invention (from whom the attorney can determine who may, under the law, 
be the creator or co-creator). 

First Creator: 

Name: Albert S. Feng 

Addr-work Beckman Institute for Advanced Science and Technology, Univ. Illinois 

Phone 217-244-1951 

Addr-home 1209 Wilshire Court, Champaign, IL 61821 

Phone 217-359-7387 

Second Creator (If any): 

Name: Charissa R. Lansing 

Addr-work Beckman Institute for Advanced Science and Technology, Univ. Illinois 
Phone 217-244-2539 

Addr-home 2903 Valley Brook Drive, Champaign, IL 61821 
Phone 217-355-8281 

Third Creator (If any): 

Name: Chen Liu 

Addr-work Beckman Institute for Advanced Science and Technology, Univ. Illinois 
Phone 217-244-3067 

Addr-home 2105B Orchard Street, Urbana, IL 61801 
Phone 217-337-0285 






Fourth Creator (If any): 

Name: William D. O'Brien 

Addr-work E 
Phone 217-333-2407 

Addr-home 2002 O'Donnell Drive, Champaign, IL 61821 
Phone 217-359-7128 

Fifth Creator (If any): 

Name: Bruce C. Wheeler 

Addr-work Beckman Institute for Advanced Science and Technology, Univ. Illinois 
Phone 217-333-3236 

Addr-home 1203 Waverly Drive, Champaign, IL 61821 
Phone 217-359-0527 



3. Provide a general summary of the subject matter of the invention in fifty (50) words or less. 
What is the purpose of the invention? Is it a new product, process, or composition of matter? 
A new use for or improvement to an existing product, process or composition of matter? 

A new neurally-inspired dual-microphone-based acoustic signal extraction algorithm has been 
invented. The algorithm can be incorporated into the front-end of a hearing-aid system or an intelligent 
gathering system that can effectively extract an acoustic signal from a specified direction in the presence 
of more intense noise (up to 30 dB more intense) originating from a different source that is physically 
segregated from the signal source by as little as 2°. 



For answers to the following questions, use remainder of sheet and attach extra sheets as necessary. 

4. Provide a full and complete description of the closest known prior methods or apparatus and any disadvantages or 
problems of each that are solved by the present invention. As to each problem identified, provide a paragraph or so 
indicating: 

(a) How long the problem existed and/or how long has it been appreciated to be a problem: 

The problem of extracting a desired signal in the presence of intense noise is a long-standing 
problem in audiology and engineering. Most hearing aid devices available in the market today do not 
permit selective amplification of a desired signal when the signal is contaminated by noise from a nearby 
source (particularly if the noise is more intense than the signal). Improving the design of hearing aids, 
including increased directionality and noise suppression, is one of the priorities for the National Institute 
on Deafness and other Communication Disorders of the National Institutes of Health (as stated in the 
1992 National Strategic Research Plan). A similar unresolved problem has existed for a very long time 
in the field of engineering acoustics: namely, how can a sound that is heavily masked by intense noise 
from a nearby source be captured with high fidelity? 

The following strategies are most commonly employed to solve this problem: use of microphone 
arrays (i.e., beam forming array) or a single highly directional microphone to enhance the directionality 
of the receiver; use of digital signal processors or an adaptive signal processor; combined use of 
directional microphone and adaptive beam-forming; use of personalized binaural directional hearing 
apparatus (to restore the directional hearing capacity of hearing impaired individuals); use of an adaptive 
filter; use of multi-channel compression hearing aids; use of a binaural localization scheme to locate the 
directions of signal and noise sources and a cross-correlation based processing or Wiener filter to obtain 
estimates of the desired signal; use of formant enhancement algorithms or amplification of frication (i.e., 
contrast enhancement algorithms). 

As concluded by Lim and Oppenheim (1979), all the single-microphone speech-enhancement 
methods such as spectral subtraction, comb filtering, and speech-production modeling, fail to improve 
the intelligibility of the desired speech. 

While the optimum beam-forming approach was shown to improve the intelligibility of the 
desired speech (Stadler and Rabinowitz, 1993; Soede, et al., 1993; Kates, 1993), this ability is greatly 
reduced when the speech source and the noise source are close to one another (<25°). This is due to the 
wide main beam which in turn resulted from the limited dimension of the microphone array. In addition, 
in the case of one noise source in a less reverberant environment, the noise cancellation provided by the 
beam-former varies with the location of the noise source with respect to the microphone array. 

One two-microphone approach proposed by Bodden (1993) performs noise cancellation by 
means of Wiener filtering. However, since the spectra of the desired speech and the noise are not 
available, the estimate of the Wiener filter based on cross-correlation inevitably results in a deterioration 
of the fidelity of the desired speech signal (i.e., a portion of the speech signal is removed by the process 
of noise cancellation). This method can only suppress noise of equal intensity to that of a speech signal 
and the example given is for an angular separation of 30°. In addition, the approach was proposed as a 



Section 4(b) redacted 



(c) How the invention solves or reduces each such problem. 

The approach taken is to use a neurally-inspired binaural localization scheme to locate the 
direction of the noise source and a proprietary mathematical algorithm to extract the desired signal at 0° 
azimuth. 



Attach any materials, such as publications, advertisements, patents, etc., you have or that are reasonably available to you 
concerning the known prior methods or apparatus. 

5. State the advantages of your invention over what has been done before, the problems it solves, or new applications 
achieved. Indicate any disadvantages or limitations and explain how they might be overcome. 

The most significant advantages of the invention are the ability of the algorithm to extract a signal 
that is many times weaker than the noise (up to S/N of -30 dB) and the exceptional fidelity of the signal 
extracted. Furthermore, the required physical separation of the sources of the signal and noise is minute 
(as low as 2°). In other words, the directionality of the system (i.e., beam-forming characteristic) is very 
steep. By comparison, the existing algorithms are limited to extracting a signal when the noise is 
relatively much less intense (up to S/N of -6 to -10 dB) and require that the signal and noise sources be 
widely separated spatially (>30°). Thus, the new design represents a major improvement in terms of the 
spatial resolution and permits a far more extensive range of masking by noise. 
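
To put the quoted S/N figures on a linear scale, the standard decibel conversions can be sketched as follows. The function names are illustrative, and the sketch assumes the usual definitions (10 log10 for power ratios, 20 log10 for amplitude ratios); the disclosure does not state which measure of intensity was used.

```python
import math


def db_to_power_ratio(db):
    """Signal-to-noise power ratio corresponding to a value in decibels."""
    return 10 ** (db / 10)


def db_to_amplitude_ratio(db):
    """Signal-to-noise amplitude ratio corresponding to a value in decibels."""
    return 10 ** (db / 20)


def power_ratio_to_db(ratio):
    """Decibel value of a signal-to-noise power ratio."""
    return 10 * math.log10(ratio)
```

Under these definitions, an S/N of -30 dB corresponds to noise power 1000 times the signal power, whereas -10 dB corresponds to noise power 10 times the signal power.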



6. State in general terms the purposes of the invention. Is it a new product, process, or composition of matter? Is it a 
new use for or improvement to an existing product, process or composition of matter? 

The purpose of the invention is to significantly improve the performance of currently available 
hearing aid devices. The new dual-microphone-based signal extraction algorithm would allow a hearing 
aid user to select a desired signal in an acoustically cluttered environment for extraction and 
simultaneously filter out competing signals originating from a separate source. 

7. Give a complete detailed description of the best model for practicing your invention with an emphasis on the new 
features or improvements over the known methods. Provide data or other evidence of the feasibility or operability of the 
invention. Attach any visual material that may be available, such as: sketches, graphs, drawings or photographs. 

The algorithm was developed and tested using computer simulation. The general description of 
the signal acquisition system/algorithm is shown in a schematic diagram [Figure 1]. The system 
comprises a front end section consisting of a microphone that is connected to a frequency analyzer and a 
delay line; a separate analyzer and delay line are connected to each of the two microphones. With the 
microphones aligned toward the loudspeaker emitting the desired signal (assume this is the reference 
direction or 0°), the front end section is used to obtain an estimate of the azimuth (or direction) of the 
noise source relative to the reference. Once the direction of the noise source has been determined, a 
processor employing a proprietary mathematical algorithm is used to extract the acoustic signal at 0°, and 
at the same time removing the unwanted sound originating from the noise source. 
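
The front end described above (a frequency analyzer followed by a delay line for each microphone) can be illustrated with a short sketch. It assumes that the frequency analyzer is a discrete Fourier transform and that each delay stage is a whole number of samples, applied in the frequency domain as a phase factor; the function names are hypothetical and the delays are circular because of the finite transform length.

```python
import cmath


def dft(x):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(X):
    """Inverse of dft(); returns complex samples."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def delay_spectrum(X, shift):
    """Delay a spectrum by an integer number of samples (circularly) by
    multiplying bin k with exp(-j*2*pi*k*shift/n)."""
    n = len(X)
    return [X[k] * cmath.exp(-2j * cmath.pi * k * shift / n) for k in range(n)]


def delay_line(X, shifts):
    """Return one delayed copy of the spectrum per delay stage."""
    return [delay_spectrum(X, shift) for shift in shifts]
```

One such delay line per microphone yields the two families of delayed spectral signals from which pairs can be formed.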

We have simulation data to indicate that a speech signal having a peak-to-peak amplitude of 0.9 
relative units [Figure 2], as emitted from a loudspeaker located at 0°, can be accurately extracted [Figure 
4] when the signal is embedded in a babble noise of equal intensity [Figure 3] originating from a second 
loudspeaker at equal distance but located at a different azimuth (at 60° on one side of the signal 
loudspeaker). The algorithm performs essentially equally well [Figure 6] for an angular separation of 
only 2° and when the noise intensity is 30 dB above the intensity of speech signal (in this case the 
peak-to-peak analog amplitude of the noise is about 15.0 relative units and that of signal is 0.9 units - see 
Figure 5). Although the signal as extracted by the algorithm in the latter case is somewhat noisier than 
that obtained under the earlier test condition, the intelligibility of the speech is excellent at least for 
individuals with normal hearing. 

In addition to the computer simulation, tests have been carried out using actual microphones and 
loudspeakers in a laboratory (see item #10). These tests have provided evidence that the algorithm works 
effectively in a physical setting. 

8. Provide details of any experiments that have been conducted, including any that have been failures, as well as all that 
have been successes. 

See item #7 and item #10 for successes. 



9. Indicate any alternate embodiments, procedures or methods of construction for the invention. 

Alternate procedures and methods of construction will be evaluated as appropriate. 

10. Describe the development status (concept, laboratory tested, prototype, etc.). Indicate what further development may 
be necessary. 

The algorithm as derived from the computer simulation has been tested in the laboratory (i.e., a 
semi-anechoic room). One loudspeaker (L1) is used to emit speech signal and a second loudspeaker (L2) 
is used to produce babble noise [see Figure 1]. Two microphones, with an inter-microphone distance of 
15 cm, are placed in front of L1 at a distance of about 3 feet. The L2 is placed at different azimuths but at 
the same distance from the mid-point of the microphones. The outputs of the microphone preamplifiers 
are fed into a computer (Sun Sparc 20 or a Pentium 150 MHz) and processed according to the algorithm. 
The algorithm requires that the L1 be placed directly ahead of the microphones and that the angular 
azimuth of L2 be accurately estimated. The angular estimation under the present system requires 20-30 
iterations, or >100 msec. Once the estimation of spatial location is acquired, the computation leading to 
the extraction of the signal is fairly rapid. 
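
The size of the inter-microphone time difference implied by this setup can be estimated with a short calculation. The sketch below is an illustration under assumptions not stated in the disclosure: a far-field (plane-wave) approximation, delay = spacing x sin(azimuth) / c, and a speed of sound of 343 m/s. The function name and parameters are hypothetical.

```python
import math


def inter_microphone_delay(spacing_m, azimuth_deg, speed_of_sound=343.0):
    """Far-field arrival-time difference (seconds) between two microphones
    separated by spacing_m for a source at azimuth_deg off the forward axis."""
    return spacing_m * math.sin(math.radians(azimuth_deg)) / speed_of_sound


# Values taken from the text: 15 cm spacing, azimuths of 2 and 60 degrees.
delay_2_deg = inter_microphone_delay(0.15, 2.0)
delay_60_deg = inter_microphone_delay(0.15, 60.0)
```

Under these assumptions a source at 2° produces a difference of roughly 15 microseconds, compared with roughly 380 microseconds at 60°, which gives a sense of the timing resolution that the angular estimation has to achieve.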

To make a functional hearing aid device using the algorithm invented, we need to incorporate the 
algorithm into a small integrated-circuit chip. Miniaturization of the hearing aid device is thereby 
important for its ultimate feasibility. Also, as noted above, the computational time using Matlab is long. 
There is thus a need to improve its efficiency such that the device can perform on a real time basis. This 
performance can be readily achieved if the location of interfering noise is stationary. However, if the 
location of the noise source moves constantly, there is a need to streamline the algorithm for a more 
expeditious estimation of the angular separation between the signal and noise sources. 

11. Identify the grant(s), contract(s), or other source(s) of funds contributing to the conception and development of the 
invention. Indicate if the invention was made as part of one's assigned duties or with the use of university facilities or 
services. 

The funding for this research comes from in-house support from the Director of the Beckman 
Institute of Advanced Science and Technology, using research facilities at the Institute. With the 
exception of Dr. Chen Liu, the invention was not part of assigned duties of members of the research 
team. 



12. If work on the invention is to be continued, indicate the sources of funding and the nature of the work. 

This work will be continued in order to refine and improve the capability and performance of the 
hearing aid technology. For this, a partial support in the form of an institutional postdoctoral fellowship 
has been obtained for Dr. Chen Liu. Additional support will be sought from external funding agencies. 

13. Give names and date of any publication(s) or abstract(s), oral or written, as well as any proposed publications which 
mention or describe the invention. Separate general publications from those which disclose the critical elements of the 
invention. 

None of the results have been published or reported in any scientific meeting. A manuscript 
describing the design and the performance of this algorithm will be written up in the near future in 
preparation for submission for publication in a scientific journal. 



14. Give chronology of principal events in conception and development: 

(a) Earliest conception date (Is there substantiating evidence, such as a notebook or a witness?) 

The conception of the idea for the project was made during the Several members of 

Cone Tech (Dr. Nelson Levy [Chief Executive Officer] and Dr. Eric Coles [Vice President]) are 
witnesses of the discussion of the plan (record is available in the form of a letter from Dr. Levy dated 
January 31, 1994). The formulation of the research team took place in August of 1994. There is a 
written record to this effect in the form of a letter to Dr. Jiri Jonas (Director, Beckman Institute of 
Advanced Science and Technology) requesting seed money for this project. 

(b) Date of disclosure (oral or written) to other persons and names of such persons. 

A written disclosure was first made to Dr. Jiri Jonas (Director, Beckman Institute of Advanced 
Science and Technology) on February 9, 1996. 

(c) First written record and availability of such records. 

Initial written record was made in September of 1995 following the joining of Dr. Chen Liu to 
our research team. The records are available as files on the Sparc-20 workstation, and as hard copy 
records in the laboratory. 



(d) Dates and results of first test of invention and first successful test. 

First positive results were obtained in early 1996 and the results were reported to Dr. Jiri Jonas 
(Director, Beckman Institute of Advanced Science and Technology) on February 9, 1996. 



15. If others are known to have tried for the function or to have achieved the results of the invention and failed, describe 
and attach full particulars of those efforts. 

None to the best of our knowledge. 



Section 16 Redacted 

17. If the invention or products made in accord with the invention may have been publicly disclosed, sold, offered for sale 
or licensed or used, then provide the dates of the first of any acts that, as the invention or products made by the invention, 
might be argued to be a: 

(a) Publication-first date and place: Not applicable 



(b) Offer for Sale-first date and place: Not applicable 



(c) Offer to License-first date and place: Not applicable 



(d) Public or commercial use-first date and place: Not applicable 



(e) Non-laboratory or non-secret experimental use-first date and place: Not applicable 

First Creator 



Date: 



18. Each page of the disclosure should be signed and dated by the creator(s), and then read and 
signed by a witness, or witnesses who understands it, using the following statement: 



DISTRIBUTION Prepare and distribute 5 copies of the completed Invention Disclosure Form as 
follows: 

1 copy for your file 

1 copy to Unit Executive Officer 

1 original and 2 copies to the Research and Technology Management Office, 417 Swanlund 
Admin. Building, 601 E. John Street, Champaign, IL 61820 (MC-304) 



FIGURE CAPTIONS 



Figure 1 Schematic diagrams of the experimental setup. A. Top view of the spatial 
arrangement of the components. B. Diagram of the computational scheme. 



Figure 2 Speech signal in the form of a sentence (2 sec long) emitted from a 
loudspeaker at 0°. 

Figure 3 Speech signal embedded in babble noise of equal intensity (0 dB SNR) 
originated from a neighboring loudspeaker which is spatially separated (60° away) from the 
signal source is highly noisy. 

Figure 4 Signal recovered by the neural algorithm is essentially perfect - as clean as 
the original speech signal (shown in Figure 2). 

Figure 5 Speech signal embedded in babble noise of high intensity (-30 dB SNR) 
originating from a neighboring loudspeaker that is only 2° away from the signal source. The 
combined signal is extremely noisy. Note that the scale of the ordinate is 20x higher than in Figure 1. 



Figure 6 Signal retrieved by the neural algorithm is excellent, albeit slightly noisier 
than the original speech signal. The signal is clearly intelligible (it sounds as if a 
weakly audible hiss had been added to the speech signal shown in Figure 2). 
[Figure 1: A. Top view of the spatial arrangement of the components. B. Diagram of the computational scheme.] 

[Plot: Noisy Signal (case 4: s0b60, 0 dB)] 

[Plot: Signal: System Output (case 4: s0b60, 0 dB)] 

Figure 5 

[Plot: Signal: System Output (case 1: s0b2, -30 dB); horizontal axis: time (s)] 

Figure 6 



File#T96057 



APPENDIX 

Described herein is the information processing scheme of a Hearing Aid System which is 
designed to extract a desired signal, emitted at one location, embedded in noise emitted at a 
second location. In usual practice the desired signal would be the speech of a person 
directly in front of the listener, while the noise could be any interfering sound including the 
speech of another individual. This description includes the algorithmic components 
beginning with the digitization of the sound at two locations and ending with the 
reconstruction of the desired speech signal. 

(i) This Hearing Aid System utilizes two identical microphones as receivers (denoted #1 in 
Figure 1A); these are separated from one another by a fixed distance and positioned in such 
a way that the sound source emitting the desired signal is directly ahead (i.e., on-axis) of 
the mid-point of the microphone pair. The other sound source is off-axis at a different 
sound direction. Signals as picked up by the two microphones, $x_{Lp}(t)$ and $x_{Rp}(t)$, are fed 
to two processing channels. Here L and R denote left and right processing channels, 
respectively, and p denotes the time frame of the short-term spectral analysis (see below). 

(ii) The signals as picked up by the microphones are filtered to prevent aliasing and 
digitized by analog-to-digital converters (#2 in Figure 1A). The digital versions of the 
signals are designated as $x_{Lp}(k)$ and $x_{Rp}(k)$, respectively, where the index k refers to the 
discrete time at which the samples are taken. 

(iii) The digitized signals are transformed into the frequency domain by means of short-term 
spectral analysis (#3 in Figure 1A) across the entire audible frequency range. This 
process can be realized in practice by the Discrete Fourier Transform (DFT), or by means 
of a filter bank. The transformed outputs are complex signal amplitudes, $X_{Lp}(m)$ 
and $X_{Rp}(m)$, evaluated at discrete frequencies $f_m$ ($m = 1, \ldots, M$). 

(iv) For each frequency, the complex signal amplitudes from the two channels are fed into a 
pair of delay-lines (#4 in Figure 1A), each of which has an even number N of delay units 
(and stores N+1 values including the current value). Conceptually, the signals from the left 
microphone are propagated from left to right, while those from the right microphone are 
propagated from right to left. An acoustic signal originating from any direction will be in 
phase at one specific locus along the length of the dual delay-line. The values of the time 
delays are assigned a priori such that the acoustic space in front of the two microphones is 
divided uniformly into N+1 directions in its azimuth and each sound azimuth is uniquely 
mapped to one location along the dual delay-line. 

(v) Each signal pair $X_{Lp}(m)$ and $X_{Rp}(m)$ in the dual delay-line is input to a computational 
unit (#5 in Figure 1A and 1B), which performs the following computational algorithm: 

[Equation illegible in the source scan. The surviving fragments show a two-case expression for $X_p^{(i)}(m)$, one case for smaller location indices $i$ and one for larger, each containing complex exponentials of the form $\exp[\pm j 2\pi(\tau_a + \cdots + \tau_b) f_m]$ taken over a run of consecutive time delays.] 

The locations on the dual delay-lines are both indexed from left ($i = 1$) to right ($i = N+1$). 
The $\tau_i$ are the values of the time delays. 

(vi) At the outputs of the computational units, an average of the energy is derived for each 
location (i) across the frequency bins $m = 1, \ldots, M$ (#6 in Figure 1A): 

(vii) A time average of the output $X_p^{(i)}$, over the P most recent spectral-analysis time 
frames, is then computed as follows: 

where $\gamma_p$ are empirically determined weighting factors. The noise source localization unit 
(#7 in Figure 1A) makes an estimate of the azimuth of the noise source by finding the 
location of the global minimum $X^{(i_{noise})}$ of $X^{(i)}$ along the dual delay-line (output of #6 in 
Figure 1A): 

$$X^{(i_{noise})} = \min_i \left[ X^{(i)} \right]$$ 
(viii) The noise source localization unit controls the input selection (#8 in Figure 1A) of the 
speech reconstruction module (#9 in Figure 1A), to pinpoint the $i_{noise}$-th column of the 
output matrix of processing unit (#5 in Figure 1A), $X_p^{(i_{noise})}(m)$, for which the noise is 
maximally canceled. Thus, $X_p^{(i_{noise})}(m)$ provides the best estimate, symbolized by $\hat{S}_p(m)$, 
of the spectrum $S_p(m)$ of the desired signal. 

(ix) The speech reconstruction module (#9 in Figure 1A) converts the signal spectrum 
estimate $\hat{S}_p(m)$ to the time domain: 

$$\hat{s}_p(k) \Leftrightarrow \hat{S}_p(m)$$ 

(x) The digital version of the speech signal $\hat{s}_p(k)$ is then converted into its analog form 
[$\hat{s}_p(t)$] by means of a digital-to-analog converter (#10 in Figure 1A). 
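
The spectral-analysis and noise-localization steps above, (iii) and (vi) through (viii), can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the averaging formulas of steps (vi) and (vii) did not survive in this copy, so the code assumes a plain mean of squared magnitudes across frequency bins and a weighted sum across frames. The function names (dft_frame, mean_energy, time_average, locate_noise) are hypothetical, and locations are indexed from 0 here rather than from 1 as in the text.

```python
import cmath


def dft_frame(samples):
    """Step (iii): direct DFT of one analysis frame (O(n^2), illustration only)."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * m * k / n) for k, x in enumerate(samples))
        for m in range(n)
    ]


def mean_energy(unit_outputs):
    """Step (vi), ASSUMED form: mean squared magnitude across frequency bins.

    unit_outputs[i][m] is the complex output of the computational unit at
    delay-line location i and frequency bin m.
    """
    return [sum(abs(v) ** 2 for v in row) / len(row) for row in unit_outputs]


def time_average(energy_frames, weights):
    """Step (vii), ASSUMED form: weighted sum over the P most recent frames.

    energy_frames[p][i] is the step (vi) energy of frame p at location i;
    weights[p] plays the role of the empirically chosen weighting factors.
    """
    n_locations = len(energy_frames[0])
    return [
        sum(w * frame[i] for w, frame in zip(weights, energy_frames))
        for i in range(n_locations)
    ]


def locate_noise(energy_frames, weights):
    """Steps (vii)-(viii): 0-based location of the global minimum energy."""
    averaged = time_average(energy_frames, weights)
    return min(range(len(averaged)), key=averaged.__getitem__)
```

The index returned by locate_noise would then select which column of the unit outputs is passed to the speech reconstruction module, since that is the location at which the noise is most strongly canceled.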
[Figure 1: (A) Block diagram of the system. (B) Details of the 2-delay-line and accompanying calculation units.] 



Extraction of Signal Embedded in Noise 



What follows is consideration of separation of two sound sources over a single frequency 
(denoted by $\omega$). By means of Fourier transform, the separation of two complex-sound sources 
can be accomplished. 

The source of desired signal is assumed to be straight ahead, while the interfering sound (or 
noise) source is off-axis. Two omnidirectional microphones are employed to detect the signal and 
noise. The outputs of these microphones are filtered, digitized, Fourier transformed, and fed into a 
pair of delay-lines (see Appendix for the arrangement of the paired delay lines). The tapped 
outputs of the two delay lines are paired as shown in Fig. 1. Given the azimuthal position of the 
source of the desired signal, the components of the desired signal should be in phase at the outputs 
$\phi_1$ and $\phi_2$ at the midpoint of the delay lines. This corresponds to position I on the delay line 
(Fig. 1). Similarly, the components of the interfering sound are in phase at the outputs $\phi_3$ and $\phi_4$ 
at a different position marked II in Fig. 1 (see Appendix for finding this position). The phase 
distance between the positions I and II is assumed to be $\Delta$. 

We denote the desired signal at position I at an arbitrary instant t as 

$$A_0 e^{j(\omega t + \theta_0)} \qquad (1)$$ 

and the interfering sound at position II as 

$$A_1 e^{j(\omega t + \theta_1)} \qquad (2)$$ 

where $\theta_0$ is the phase angle of the desired signal at position I, and $\theta_1$ is the phase angle of the interfering 
sound at position II. The outputs $\phi_1$ and $\phi_2$ of the delay lines at the midpoint I at the same instant 
t are, respectively, 

$$\phi_1 = \left( A_0 e^{j\theta_0} + A_1 e^{j(\theta_1 + \Delta)} \right) e^{j\omega t} \qquad (3)$$ 

and 

$$\phi_2 = \left( A_0 e^{j\theta_0} + A_1 e^{j(\theta_1 - \Delta)} \right) e^{j\omega t} \qquad (4)$$ 

The outputs $\phi_3$ and $\phi_4$ of the delay lines at position II at the instant t are, respectively, 

$$\phi_3 = \left( A_0 e^{j(\theta_0 - \Delta)} + A_1 e^{j\theta_1} \right) e^{j\omega t} \qquad (5)$$ 

and 

$$\phi_4 = \left( A_0 e^{j(\theta_0 + \Delta)} + A_1 e^{j\theta_1} \right) e^{j\omega t} \qquad (6)$$ 

The factor $e^{j\omega t}$ can be ignored since it is the same not only for both the desired and the 
interfering sound, but also for the outputs at the different positions on the delay lines as long as the 
outputs are taken at the same time instant t. Thus the amplitude as well as the initial phase of the 
desired signal at instant t can be estimated by 

$$A_0 e^{j\theta_0} = \frac{\phi_4 - \phi_3}{e^{j\Delta} - e^{-j\Delta}} \qquad (7)$$ 



This estimation can be obtained provided the position of the source of interfering sound ($\Delta$) is 
known or can be accurately determined. 
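
The cancellation in the estimate can be checked numerically. The sketch below assumes that the two delay-line outputs at position II take the form $\phi_3 = (A_0 e^{j(\theta_0 - \Delta)} + A_1 e^{j\theta_1}) e^{j\omega t}$ and $\phi_4 = (A_0 e^{j(\theta_0 + \Delta)} + A_1 e^{j\theta_1}) e^{j\omega t}$, which is one reading of equations (5) and (6) from a damaged scan; the amplitude symbols $A_0$, $A_1$ and the function names are illustrative assumptions, not taken from the original. Because the interfering term is identical in both outputs, it drops out of the difference, and dividing by $e^{j\Delta} - e^{-j\Delta}$ leaves the desired component.

```python
import cmath


def outputs_at_noise_position(a0, theta0, a1, theta1, delta, omega_t=0.0):
    """Assumed model of the two delay-line outputs at position II.

    The interfering sound (a1, theta1) is in phase on both lines, while the
    desired signal (a0, theta0) is shifted by -delta on one and +delta on the other.
    """
    carrier = cmath.exp(1j * omega_t)
    desired = a0 * cmath.exp(1j * theta0)
    noise = a1 * cmath.exp(1j * theta1)
    phi3 = (desired * cmath.exp(-1j * delta) + noise) * carrier
    phi4 = (desired * cmath.exp(1j * delta) + noise) * carrier
    return phi3, phi4


def estimate_desired(phi3, phi4, delta):
    """Recover a0 * exp(j * theta0) from the pair, with the carrier factor dropped.

    The denominator equals 2j * sin(delta), so the estimate is undefined when
    delta is a multiple of pi (the two sources are then indistinguishable).
    """
    return (phi4 - phi3) / (cmath.exp(1j * delta) - cmath.exp(-1j * delta))


if __name__ == "__main__":
    # Interfering sound twice as strong as the desired signal.
    phi3, phi4 = outputs_at_noise_position(1.5, 0.4, 3.0, -1.1, 0.7)
    estimate = estimate_desired(phi3, phi4, 0.7)
    print(abs(estimate), cmath.phase(estimate))
```

In this idealized single-frequency model the recovered amplitude and phase match the desired signal regardless of the strength of the interfering sound; in practice the accuracy depends on how well $\Delta$ is known.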
Fig. 1. Schematic diagram of signal extraction algorithm. 
Derivation of global minima for determining the direction(s) of sound source(s) 



Two figures are shown here to illustrate the concept of global minima (or maxima) 
that is used for determining the direction(s) of sound source(s). These figures are derived 
from a new algorithm which represents an important improvement over the original 
algorithm. The improvement is in the capacity to determine the directions of both the 
desired signal and the interfering sound. With this improvement, we essentially remove the 
need to align the microphone pair toward the desired signal. Instead, the direction of the 
desired signal can be any arbitrary direction close to the on-axis. In the situation where 
there are two sound sources, the sound originating from a location closer to the on-axis 
will be treated as the desired signal whereas the sound originating from off-axis will be 
treated as noise (or interfering sound), or vice versa. 

Figure 1 shows the 3-dimensional plot of values of the equivalent of inverse of 
energy [as used in the original algorithm], from the 18th spectral-analysis time frame. The 
x-axis is the index (i) of the position on the dual delay-line, the y-axis is the index (m) of each 
frequency bin, and the z-axis indicates the energy of each frequency component at each 
position on the dual delay-line. The locations of the peaks of local maxima (see Figure 2 for 
global maxima) on the dual delay-line indicate the azimuthal positions of both the desired 
signal and the interfering sound. 

Figure 2 displays the average of the inverse of the energy derived for each location 
(i) across the frequency bins (see Figure 1). The peaks represent the global maxima which 
correspond to the locations of the sound sources in free-field. The third global maximum on 
the far right is an artifact that, for practical purposes, can be ignored. 
bwheeler@uiuc.edu, 10:50 AM 10/25/9..., some notes 

Date: Tue, 25 Oct 1994 10:50:11 -0500 
X-Sender: wheeler@uxl.cso.uiuc.edu 
Content-Type: text/plain; charset="us-ascii" 

To: feng@uxl.cso.uiuc.edu, Charissa Lansing <crl@uiuc.edu>, yxz@enterprise.ifp.uiuc.edu 
From: bwheeler@uiuc.edu 
Subject: some notes 

Al, Charissa, and Yunxin, 

This is only a start, but I thought I'd better get you something rather 
than nothing. 

Bruce 



Intelligent Hearing Aid 
Notes and Outline 
Bruce Wheeler 
10/19/94 

Preamble: 

Our current working rationale is as follows: 

While there has been tremendous progress in the signal processing 
associated with hearing aids, including filters with gains which are 
programmable, adaptive, and frequency band specific, there are two types of 
information which are underexploited and which should be the basis, either 
singly or in combination, for a new generation of devices. These are: 

1. spatial information, gained from multiple microphone receivers, 
especially with directional selectivity, with new strategies for optimal 
physical placement and appropriate signal processing 

2. speech-unit specific information, which can be detected by digital 
pattern recognition and used to control signal gain in order to enhance the 
parts of speech most important for improved intelligibility by the hearing 
impaired 

These observations have support in the literature, which we are actively 
exploring, and lead to a plan for research spanning: 

1. the physiological bases for hearing deficits, especially as it relates 
to aging 

2. the psychoacoustical documentation of these deficits, 

3. the evaluation of prescriptive remedies which include both new hearing 
aids and strategies for use with visual information (e.g. lip reading), 

4. a combination of strategies for microphone placement and signal 
processing for directionality and noise cancellation 

5. signal processing for speech unit recognition coupled to enhanced filtering 

6. an architecture for a hearing aid chip, which allows flexible 
development of both spatial and speech component dependent amplification 

7. microminiaturization for ultimate feasibility of an in the ear or in the 
canal hearing aid. 

Background: 

1. State of the art in analog and digital hearing aids 

2. Understanding of the physiology of hearing deficits, especially as 
related to aging 

3. Review of what is known about increasing intelligibility of speech with 
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Bruce C. Wheeler, Associate Professor 
Electrical and Computer Engineering Department, 
Neuroscience Program, and Bioengineering Faculty 
Beckman Institute, University of Illinois at Urbana-Champaign 
405 N. Mathews, Urbana IL 61801; 
217-333-3236; FAX: 217-244-5180 or 217-244-8371 
email: bwheeler@uiuc.edu 
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Beckman Institute 



December 27, 1994 



TO: 



Bruce Wheeler 
Charissa Lansing 
Yunxin Zhao 



FROM: 



Al Feng 




Funding from BI 



Good news! Margarita Ham just informed me that our request for equipment 
has been approved for the full amount. I should be getting the official notification from 
her shortly. 

We can now order all the equipment on our list. I have already told Bob Penka 
that we can contribute $2K toward the software license. We should order the rest as 
soon as we can. We do not have the experimental room assigned to us yet but I hope 
to hear from Sarah shortly. 



r-penka@uiuc.edu, 2:06 PM 1/27/95 ..., ESPS/waves+/HTK train finally leaves 

Date: Fri, 27 Jan 1995 14:06:01 -0600 
X-Sender: penka@uxl.cso.uiuc.edu 
Mime-Version: 1.0 

To: morgan@cogsci.uiuc.edu, feng@uxl.cso.uiuc.edu, rbargar@ncsa.uiuc.edu, 
yxz@ifp.uiuc.edu, r-sousa 
From: r-penka@uiuc.edu 

Subject: ESPS/waves+/HTK train finally leaves the station! 

I have started the paperwork to order ESPS/waves+/HTK. Before 
Entropic can ship the software I must tell them: 

1) the platforms we have and the media we need 

2) the number of licenses we need 

3) the identity of the support person in each participating unit who will be 
entitled to contact Entropic for software support. Entropic expects one 
support person per contributing unit, but this might be negotiable. I named 
the following units as contributors: 

Beckman Institute 

College of Liberal Arts and Sciences 

Department of Electrical and Computer Engineering 

NCSA 

Department of Spanish, Italian and Portuguese 
Department of Linguistics 
Everyone within these units will be licensed to use ESPS/waves+/HTK. 

To enable me to supply this information to Entropic would you please tell me: 

A) the name of the support person for the unit you represent. This person will 
have access to Entropic technical support. To eliminate ambiguity for those 
of you in Beckman, assume the following pairings of names and units: 
Bargar - NCSA 

Zhao - Computer and Electrical Engineering 

Feng - Beckman Institute 

Morgan - Linguistics 

Sousa - Spanish, Italian and Portuguese 

We could name a sixth person (for LAS) but I'm at a loss for how to identify 
that person. Jerry Morgan, could you suggest someone? 

B) The number of licenses your (as defined above) unit will need. 

C) the media your unit can handle and the platforms on which it will run 
ESPS/waves+/HTK. Ken Nelson of Entropic tells me that ESPS/waves+/HTK is 
available only for: 

Sun/4 or SPARCstations running Solaris or Sun/OS 

SGI Indy and Indigo workstations running system versions 4.05F - 5.2 

HP 9000 running HP/UX 

DEC Alpha workstations running OSF/1 

DECstation 3100 or 5100 running Ultrix 

Media available: 4-mm tape 
8-mm tape 

QIC tape cartridge (1/4 inch tape) 

Thank you 
Robert Penka 
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r-penka@uiuc.edu, 3:20 PM 2/14/95 ...,ESPS/waves+/HTK 



X-Sender: penka@uxl.cso.uiuc.edu 
Mime-Version: 1.0 

Date: Tue, 14 Feb 1995 15:20:31 -0600 

To: l-haken@uiuc.edu, feng@uxl.cso.uiuc.edu, rbargar@ncsa.uiuc.edu 
From: r-penka@uiuc.edu 
Subject: ESPS/waves+/HTK 

Before Entropic software will ship ESPS/waves+/HTK I must supply them with 
the information requested below. Entropic asks that I tell them everything 
we need up front so that they can include everything in a single shipment. 

1) Please tell me the platforms on which you plan to run the software. 
Here are the available platforms: 

SUN workstations (SUN/OS) 
SUN workstations (Solaris) 

SGI Indy and Indigo workstations (system versions 4.05 through 5.2) 
DEC Alpha workstations (OSF/1) 
HP 9000/700 (HP/UX) 
DECstation (Ultrix) 

2) For each platform you will need, please identify the media you require. 
The available formats are: 

4mm DAT 
8mm exabyte 
1/4" cartridge 

3) Please identify the number of servers you will have and the number of 
licenses each server will manage. E.g. (license server #1, 15 simultaneous 
users), (license server #2, 25 simultaneous users), .... 

4) Please identify a technical support person for your use of 
ESPS/waves+/HTK. Entropic technical support will accept incoming calls 
from this person. [The contract permits us to name one person (must be an 
employee, not a student) per contributing department. It's quite likely 
that they will relax this rule and permit us to name one person per site. 
Give the names of the persons you would like named but, just in case, 
identify the primary person you want named if Entropic will give you only 
one slot.] 

I would appreciate a prompt reply. Entropic has received the signed 
license agreement and the Purchase Order. They will ship as soon as I give 
them this information. 
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August 25, 1994 



Mr. Chen Liu 

Department of Biomedical Engineering 
Technion - Israel Institute of Technology 
Technion City, Haifa 32000 
Israel 



Dear Mr. Chen: 

I read with great interest your letter of July 21, 1994 in which you inquired about a 
postdoctoral position in my laboratory. Your research interest matches mine almost perfectly. 
Students and postdocs in my laboratory are presently engaged in physiological studies aiming 
toward the understanding of physiological mechanisms that underlie coherent perception of sounds 
in "noisy" environment, i.e., similar to a "cocktail party". Several colleagues of mine from the 
Beckman Institute for the Advanced Science and Technology (my home base) and I have also begun 
collaborative work to pursue research on designing intelligent hearing aid devices. I therefore think 
you would fit in well with our research programs. Further, I am in a position to offer you 
postdoctoral salary, at least for your first year. 

In looking over your CV, I notice that you did not list the names of references. As is 
customary for postdoctoral applications, however, I would appreciate receiving three letters of 
recommendation from professors who can provide frank assessments of your scientific 
qualifications as well as personal characters. I would also welcome their comments regarding your 
productivity as a graduate student or research associate. Finally, to further assist me in the 
evaluation, can you please send me reprints of your past publications and copy of manuscript(s) 
that describes your present dissertation work? 

I look forward to receiving these materials in the mail, or through fax (217-244-5180). 

Sincerely, 



Albert S. Feng 

Professor of Physiology & Bioengineering 
Tel: 217-333-1734 

Email: FENG@UXl.CSO.UIUC.EDU 
TECHNION - ISRAEL INSTITUTE OF TECHNOLOGY 
Department of Bio-Medical Engineering 

The Julius Silver Institute of Bio-Medical Engineering Sciences 

Department of Physiology and Biophysics 
524 Burrill Hall 
407 South Goodwin Avenue 
Urbana, IL 61801 USA 

Dear Professor Feng, 

My passport has just been replaced. The information in the new passport is given below. 

Family Name: ^ 

„. „ — Chen 

30«*« 1964,Tta#«,P*.adn. 

Marital Status: Married 
Citizenship & Country of 

Legal Permanent Residence: f^^ife 
^P Cn tT#. 2806748 

Sg Place: Tel-Aviv The .Embassy of P-R-China in Israel 

*ExpirationDate: February 14, 2000 

Only the items labeled with asterisks are changed. My wife's personal information remains 
the same as I sent to you on October 19, 1994. 

Family Name: °J 

January 1969, Yunnan. PJL China 

Marital Status: Married 
Citizenship & Country of 

Legal Permanent Residence: P £ Vjuna 

Passport*: 21 .? 

Kumg Place: Tianjin, PJL Chma 

Expiration Date: February 22, 1998 

In case you need further information, please contact me either by email 
(iiuitoeltechnion.ac.U) or by fax (00972-4-234 131). I am sorry for this change and 
the inconvenience incurred. 

With best regards, 
Chen Liu 



Technion City, Haifa 32 000, Israel. Tel.: (04) 239431, 294130 

TELEX : 46406 TECON IL : FAX. : 972-4-234131 



Chen Liu, Re: schedule change 



To: Chen Liu <liu@biomed.technion.ac.il> 
From: feng@uxl.cso.uiuc.edu 
Subject: Re: schedule change 
Cc: ASF 



Dear Mr. Liu: 



Happy New Year to you too! 

I can understand the need for a schedule change. It happens quite often for our students. A delay of one 
month does not pose a problem (but if it goes far beyond one month we may have to reconsider our offer). 

With best wishes.. 

Albert Feng 



>Dear Professor Feng, 
> 

>First of all, I would like to take this opportunity to wish you a happy 

>Chinese Spring Festival. 

> 

>I am writing to you due to a quite unexpected change in my plans. My 
>superviser, Prof. S. Sideman, has changed his Sabbatical schedule and, 
>despite my request otherwise, he has postponed my final examination. 
>Therefore, I will only be able to come to the US around late August, 1995. 
>I am very sorry for this development. 
> 

>I hope that this will not unduly affect my plans to come to the US for my 

>post-doctoral work. 

> 

>Also, for your information, my Chinese passport is being renewed. I shall 
>advise Ms. M. Ham of my new passport number as soon as I receive it. 
> 

>I am very sorry for the inconvenience that this change in my exam schedule 
>may incur, but hope that you understand and look forward to beginning my 
>work with you soon. Please let me know how this schedule looks to you, 
>and I thank you in advance for your understanding in the matter. 
> 

>With best regards, 
>Chen Liu 



Printed for feng@uxl^o.uiuc 
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November 29, 1995 



TO : Hearing Aid Research Faculty 

FROM: Al Feng 

RE: Draft for proposal for the Critical Research Initiative 

Attached please find a draft for proposal that is to be submitted to the CRI for obtaining 
support for our project. 



Specific Aims (Draft) 

Over 28 million Americans have hearing impairments that restrict their ability to 
communicate. Of these, about 5 million employ hearing aid devices to improve their ability to hear 
and to communicate. A survey of hearing impaired subjects who wear hearing aid devices indicates 
that, in spite of the advances in the electronic technology, only 58% of them found the current 
generation of hearing aid devices to be adequate for their needs. The limited satisfaction of hearing 
aids is in part attributed to the wider range of acoustic environments in which the hearing impaired 
subjects dwell, a situation that was made possible by successful miniaturization of hearing aid 
devices. The limitation comes from the fact that the hearing aid apparatus usually amplifies all 
sounds including the desired signal as well as the competing background noise (or unwanted signals). 

A collaborative research team at the Beckman Institute has launched an interdisciplinary 
effort to design and construct intelligent hearing aid devices that selectively amplify speech sound 
(i.e., the signal) embedded in the background noise originating from different sectors of auditory 
space. Selective speech processing in the normal hearing subject depends on the ability of the 
nervous system to focus on sounds emanating from a narrow sector of auditory space. The desired 
speech signal can be successfully deciphered even in the presence of intense background noise so 
long as the origins of the speech signal and noise are spatially separated. 

The fundamental approach is to use neurally inspired scheme to develop a highly directional 
signal acquisition system which can effectively acquire desired signal originating from a small sector 
of auditory space such that interfering sounds emanating from other sectors of auditory space are 
deemphasized. Once the speech signal is captured, it would be appropriately amplified and its 
speech content enhanced, using the state-of-the-art speech enhancement algorithms, to improve the 
speech intelligibility. This research has two specific aims. 

Aim #1 is to test the hypothesis that an effective signal acquisition (SA) system for acquiring a 
signal in noisy background can be achieved by a neurally inspired system. The real-time SA system 
shall consist of strategically-placed microphones which can be directed into a segment of auditory 
space for picking up speech sounds originating from that sector of auditory space. To achieve high 
directionality of the microphone system, signals picked up by the two microphones shall be 
processed using neurally inspired algorithms known to be highly effective. For this, the desired 
signal will be decomposed into its Fourier components, processed, and recomposed to restore to its 
original form. 

Approach : 

1. Employ different placements of microphones and different neural algorithms to optimize signal 
acquisition and noise cancellation - (PIs: Liu, Feng, O'Brien, Wheeler, and TBA). 

2. Evaluation of the effectiveness of the SA system in normal listening subjects - (PIs: Bilger, 
Gupta, and TBA). 

Aim #2 is to test the hypothesis that an intelligent hearing aid (IHA) system for effective speech 
recognition in noisy background can be reconstructed by combining the multiple-microphone based 
SA system (described in Aim #1) with: (a) speech enhancement algorithms (i.e., speech-unit 
amplification and/or filtering algorithms), (b) visually-based lip reading mechanism. 



Approach : 

1. Test different speech enhancement algorithms electronically - (PIs: Lansing, Zhao, and TBA). 

2. Test the effectiveness of IHA devices in normal subjects and hearing-impaired subjects of various 
etiologies and different age groups - (PIs: Bilger, Gupta, and TBA). 

Upon completion of the design and testing of the real time system, we will focus on 
microminiaturizing the IHA devices for ultimate feasibility as practical hearing assistive devices. The 
miniaturization portion of the project would include designing a radio-transmission remote-control 
system, packaging, battery, etc. Additional expertise will be recruited from the Microelectronics 
Center (e.g., Steve Kang) to complete this phase of the project. 

List of investigators (collaborators): 

Robert Bilger Professor of Speech and Hearing 

Albert S. Feng Professor of Beckman Institute & Molecular & Integrative Physiology 

Prahlad Gupta Beckman Fellow (starting date in the BI: January 1997) - He will be a part- 
time researcher on this project 
Charissa Lansing Assistant Professor of Beckman Institute & Speech and Hearing 
Chen Liu Postdoctoral Fellow at the Beckman Institute 

William O'Brien Professor of Electrical and Computer Engineering 
Bruce Wheeler Associate Professor of Beckman Institute & Electrical and Computer Eng. 
Yun-Xin Zhao Assistant Professor of Beckman Institute & Electrical and Computer Eng. 

TBAs Graduate Research Assistants - To be announced 

Budget : 

Postdoctoral salary for Dr. Chen Liu for two years (8/96 to 8/98) $ 52,000 

Salary for one TBA graduate research assistant for two years (8/96 to 8/98) $ 28,000 

Salary for one TBA graduate research assistant for one year (8/97 to 8/98) $ 14,000 

Electronic parts and supplies $ 6,000 



Total amount requested for two years $ 100,000 

Budget Justifications : 

This collaborative project is being pursued at the Beckman Institute under the support of Jiri 
Jonas. His support covers the initial purchase of testing equipment and the first year salary for Dr. 
Chen Liu, a postdoc having extensive experience with the design of hearing aid devices. Sound 
generation and testing equipment, and a workstation along with appropriate interface hardware and 
software, were purchased earlier this year. These are installed in a sound isolated booth located in 
assigned laboratory space at the Beckman Institute. 

Design of hearing prosthetic devices is of high priority for the National Institute for 
Deafness and Other Communicative Disorders. When we have pilot data to demonstrate the 
feasibility of the project the chance of receiving a research grant is high. 



G 




Huerta, Lynn, 1:15 PM 2/12/96..., RE: SBIR and STTR 



Subject: SBIR and STTR 

Date: Monday, February 12, 1996 10:40AM 
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Hi, Lynn. Congratulations for the new addition to your family! I am pleased 
to hear from Amy that both the baby and the mother are doing fine. 

I tried to call you this morning but you were away from your desk. So I 
decided to communicate by email. I write to ask if you would be so kind to 
send me the brochures and application packages for SBIR and STTR. 

I informed Amy during the ARO meeting in Tampa that I am leading a group of 
researchers at the Beckman Institute to develop hearing aid devices that 
can help the hearing-impaired subjects to hear in complex acoustic 
environments with multi talkers (such as a cocktail party). The algorithm 
is neurally inspired and is looking very promising at least in our computer 
simulation. Our efforts are presently supported by the Beckman Institute 
(We did not think we had a fair chance if we were to compete with the high 
power groups of researchers before we got our feet wet and established our 
ground). The project has progressed exceedingly well; the simulation 
results show that a speech signal can be faithfully extracted even when a 
babble noise of equal intensity (S/N of 0 dB) is broadcast from an 
off-axis location. We are carrying out a detailed evaluation of the algorithm. With 
our initial rapid success, we are encouraged that this direction is a great 
way to go for developing devices that can function well in complex noisy 
conditions. 

To further this research, our team would benefit greatly if we can obtain 
financial support from the NIDCD. Amy indicated that such research projects 
have high priorities, and the success can bolster the credibility and 
publicity of the Institute, and that NIDCD would be interested in 
supporting promising research in this area. She suggested that we explore 
the SBIR (or STTR which, according to Bruce Wheeler, one of my 
collaborators, may be a better way to go) to reduce the risk of premature 
extensive exposure of the project before it has a chance to prove its 
overall capability. I believe we have a strong team: Bruce Wheeler, William 
O'Brien and Yun-Xin Zhao (three faculty members from the Electrical and 
Computer Engineering department specializing in signal processing and 
speech recognition), Charissa Lansing and Bob Bilger (two professors from 
Speech and Hearing), myself and a postdoc who specialized in hearing aid 
research during his PhD dissertation. Additionally, once the algorithm has 
been proven and tested, several faculty from our renowned Microelectronics 
Center have expressed interest and support for miniaturizing the devices. 

I would appreciate it if you can forward to me the brochures and 
application packages for SBIR and STTR. 

With best regards, 

Al 



Printed for feng@uxL^ 
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